klion26 commented on a change in pull request #12420:
URL: https://github.com/apache/flink/pull/12420#discussion_r443307501
##########
File path: docs/dev/table/streaming/joins.zh.md
##########
@@ -140,26 +138,22 @@ FROM
WHERE r.currency = o.currency
{% endhighlight %}
-Each record from the probe side will be joined with the version of the build
side table at the time of the correlated time attribute of the probe side
record.
-In order to support updates (overwrites) of previous values on the build side
table, the table must define a primary key.
+探针侧的每条记录都将与构建侧的表执行 Join 运算,构建侧的表中与探针侧对应时间属性的记录将参与运算。为了支持更新(包括覆盖)构建侧的表,该表必须定义主键。
-In our example, each record from `Orders` will be joined with the version of
`Rates` at time `o.rowtime`. The `currency` field has been defined as the
primary key of `Rates` before and is used to connect both tables in our
example. If the query were using a processing-time notion, a newly appended
order would always be joined with the most recent version of `Rates` when
executing the operation.
+在示例中,`Orders` 表中的每一条记录都与时间点 `o.rowtime` 的 `Rates` 进行 Join 运算。`currency`
字段已被定义为 `Rates` 表的主键,在示例中该字段也被用于连接两个表。如果该查询采用的是
processing-time,则在执行时新增的订单将始终与最新的 `Rates` 执行 Join。
-In contrast to [regular joins](#regular-joins), this means that if there is a
new record on the build side, it will not affect the previous results of the
join.
-This again allows Flink to limit the number of elements that must be kept in
the state.
+与[常规 Join](#regular-joins)相反,时态表函数 Join 意味着如果在构建侧新增一行记录将不会影响之前的结果。这同时使得 Flink
能够限制必须保存在 state 中的元素数量(因为不再需要保存之前的状态)。
Review comment:
```suggestion
与[常规 Join](#regular-joins) 相反,时态表函数 Join 意味着如果在构建侧新增一行记录将不会影响之前的结果。这同时使得
Flink 能够限制必须保存在 state 中的元素数量(因为不再需要保存之前的状态)。
```
##########
File path: docs/dev/table/streaming/joins.zh.md
##########
@@ -22,37 +22,38 @@ specific language governing permissions and limitations
under the License.
-->
-Joins are a common and well-understood operation in batch data processing to
connect the rows of two relations. However, the semantics of joins on [dynamic
tables](dynamic_tables.html) are much less obvious or even confusing.
+Join 在批数据处理中是比较常见且广为人知的运算,一般用于连接两张关系表。然而在[动态表](dynamic_tables.html)中 Join
的语义会难以理解甚至让人困惑。
-Because of that, there are a couple of ways to actually perform a join using
either Table API or SQL.
+因而,Flink 提供了几种基于 Table API 和 SQL 的 Join 方法。
-For more information regarding the syntax, please check the join sections in
[Table API](../tableApi.html#joins) and [SQL]({{ site.baseurl
}}/dev/table/sql/queries.html#joins).
+欲获取更多关于 Join 语法的细节,请参考 [Table API](../tableApi.html#joins) 和 [SQL]({{
site.baseurl }}/zh/dev/table/sql/queries.html#joins) 中的 Join 章节。
* This will be replaced by the TOC
{:toc}
-Regular Joins
+<a name="regular-joins"></a>
+
+常规 Join
-------------
-Regular joins are the most generic type of join in which any new records or
changes to either side of the join input are visible and are affecting the
whole join result.
-For example, if there is a new record on the left side, it will be joined with
all of the previous and future records on the right side.
+常规 Join 是最常用的 Join 用法。在常规 Join 中,任何新记录或对 Join 两侧的表的任何更改都是可见的,并会影响最终整个 Join
的结果。例如,如果 Join 左侧插入了一条新的记录,那么它将会与 Join 右侧过去与将来的所有记录进行 Join 运算。
Review comment:
`两侧的表的` -> `两侧表的`: would that read a bit better?
##########
File path: docs/dev/table/streaming/joins.zh.md
##########
@@ -189,50 +183,43 @@ val result = orders
</div>
</div>
-**Note**: State retention defined in a [query
configuration](query_configuration.html) is not yet implemented for temporal
joins.
-This means that the required state to compute the query result might grow
infinitely depending on the number of distinct primary keys for the history
table.
+**注意**: 时态 Join中的 State 保留(在 [查询配置](query_configuration.html)
中定义)还未实现。这意味着计算的查询结果所需的状态可能会无限增长,具体数量取决于历史记录表的不重复主键个数。
+
+### 基于 Processing-time 时态 Join
-### Processing-time Temporal Joins
+如果将 processing-time 作为时间属性,将无法将 _past_ 时间属性作为参数传递给时态表函数。
+根据定义,processing-time 总会是当前时间戳。因此,基于 processing-time
的时态表函数将始终返回基础表的最新已知版本,时态表函数的调用将始终返回基础表的最新已知版本,并且基础历史表中的任何更新也将立即覆盖当前值。
-With a processing-time time attribute, it is impossible to pass _past_ time
attributes as an argument to the temporal table function.
-By definition, it is always the current timestamp. Thus, invocations of a
processing-time temporal table function will always return the latest known
versions of the underlying table
-and any updates in the underlying history table will also immediately
overwrite the current values.
+只有最新版本的构建侧记录(是否最新由所定义的主键所决定)会被保存在 state 中。
+构建侧的更新不会对之前 Join 的结果产生影响。
-Only the latest versions (with respect to the defined primary key) of the
build side records are kept in the state.
-Updates of the build side will have no effect on previously emitted join
results.
+可以将 processing-time 的时态 Join 视作简单的哈希Map `HashMap <K,V>`,HashMap 中存储来自构建侧的所有记录。
+当来自构建侧的新插入的记录与旧值具有相同的 Key 时,旧值会被覆盖。
+探针侧的每条记录将总会根据 `HashMap` 的最新/当前状态来计算。
-One can think about a processing-time temporal join as a simple `HashMap<K,
V>` that stores all of the records from the build side.
-When a new record from the build side has the same key as some previous
record, the old value is just simply overwritten.
-Every record from the probe side is always evaluated against the most
recent/current state of the `HashMap`.
+### 基于 Event-time 时态 Join
-### Event-time Temporal Joins
+将 event-time 作为时间属性时,可将 _past_ 时间属性作为参数传递给时态表函数。这允许对两个表中在相同时间点的记录执行 Join 操作。
-With an event-time time attribute (i.e., a rowtime attribute), it is possible
to pass _past_ time attributes to the temporal table function.
-This allows for joining the two tables at a common point in time.
+与基于 processing-time 的时态 Join 相比,时态表不仅将构建侧记录的最新版本(是否最新由所定义的主键所决定)保存在 state
中,同时也会存储自上一个 watermarks 以来的所有版本(按时间区分)。
-Compared to processing-time temporal joins, the temporal table does not only
keep the latest version (with respect to the defined primary key) of the build
side records in the state
-but stores all versions (identified by time) since the last watermark.
+例如,在探针侧表新插入一条 event-time 时间为 `12:30:00` 的记录,它将和构建侧表时间点为 `12:30:00`
的版本根据[时态表的概念](temporal_tables.html)进行 Join 运算。
+因此,新插入的记录仅与时间戳小于等于 `12:30:00` 的记录进行 Join 计算(由主键决定哪些时间点的数据将参与计算)。
-For example, an incoming row with an event-time timestamp of `12:30:00` that
is appended to the probe side table
-is joined with the version of the build side table at time `12:30:00`
according to the [concept of temporal tables](temporal_tables.html).
-Thus, the incoming row is only joined with rows that have a timestamp lower or
equal to `12:30:00` with
-applied updates according to the primary key until this point in time.
+通过定义事件时间(event time),[watermarks]({{ site.baseurl }}/zh/dev/event_time.html)
允许 Join 运算不断向前滚动,丢弃不再需要的构建侧快照。因为不再需要时间戳更低或相等的记录。
-By definition of event time, [watermarks]({{ site.baseurl
}}/dev/event_time.html) allow the join operation to move
-forward in time and discard versions of the build table that are no longer
necessary because no incoming row with
-lower or equal timestamp is expected.
+<a name="join-with-a-temporal-table"></a>
-Join with a Temporal Table
+时态表 Join
--------------------------
-A join with a temporal table joins an arbitrary table (left input/probe side)
with a temporal table (right input/build side),
-i.e., an external dimension table that changes over time. Please check the
corresponding page for more information about [temporal
tables](temporal_tables.html#temporal-table).
+时态表 Join 意味着对任意表(左输入/探针侧)和一个时态表(右输入/构建侧)执行的 Join
操作,即随时间变化的的扩展表。请参考相应的页面以获取更多有关[时态表](temporal_tables.html#temporal-table)的信息。
-<span class="label label-danger">Attention</span> Users can not use arbitrary
tables as a temporal table, but need to use a table backed by a
`LookupableTableSource`. A `LookupableTableSource` can only be used for
temporal join as a temporal table. See the page for more details about [how to
define
LookupableTableSource](../sourceSinks.html#defining-a-tablesource-with-lookupable).
+<span class="label label-danger">注意</span> 不是任何表都能用作时态表,能作为时态表的表必须实现接口
`LookupableTableSource`。接口 `LookupableTableSource` 的实例只能作为时态表用于时态 Join
。查看此页面获取更多关于[如何实现接口
`LookupableTableSource`](../sourceSinks.html#defining-a-tablesource-with-lookupable)
的详细内容。
Review comment:
The link target here appears to have changed to
`(../sourceSinks.html#defining-a-tablesource-for-lookupable)`.
As a follow-up, you could also open a hotfix PR to fix the link in the English version.
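
For reference while reviewing the event-time paragraphs in the hunk above, here is a minimal Python sketch of the behavior they describe: the build side keeps several versions per primary key, a probe row is matched against the version valid at its own timestamp, and a watermark lets older versions be discarded. The class and method names (`VersionedBuildSide`, `upsert`, `lookup`, `on_watermark`) and the sample rates are hypothetical and are not Flink APIs. This is only a toy model of the semantics, not how Flink implements the operator.

```python
from bisect import bisect_right
from collections import defaultdict


class VersionedBuildSide:
    """Toy model of build-side state in an event-time temporal join (hypothetical)."""

    def __init__(self):
        # Per primary key: parallel lists of sorted version timestamps and values.
        self._times = defaultdict(list)
        self._values = defaultdict(list)

    def upsert(self, key, ts, value):
        """Record a new version of `key` that becomes valid at event time `ts`."""
        i = bisect_right(self._times[key], ts)
        self._times[key].insert(i, ts)
        self._values[key].insert(i, value)

    def lookup(self, key, ts):
        """Return the version of `key` valid at `ts` (latest version with time <= ts)."""
        i = bisect_right(self._times[key], ts)
        return self._values[key][i - 1] if i > 0 else None

    def on_watermark(self, watermark):
        """Discard versions that no probe row later than `watermark` can match."""
        for key, times in self._times.items():
            # The latest version at or before the watermark is still needed.
            keep_from = max(bisect_right(times, watermark) - 1, 0)
            del times[:keep_from]
            del self._values[key][:keep_from]


# "HH:MM:SS" strings compare correctly in lexicographic order.
rates = VersionedBuildSide()
rates.upsert("Euro", "09:00:00", 114)
rates.upsert("Euro", "12:00:00", 116)
rates.upsert("Euro", "13:00:00", 119)

at_1230 = rates.lookup("Euro", "12:30:00")  # uses the 12:00:00 version
rates.on_watermark("12:30:00")              # the 09:00:00 version can be dropped
```

A probe row stamped `12:30:00` only ever sees versions with a timestamp lower than or equal to `12:30:00`, which matches the wording in the original English paragraph.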
##########
File path: docs/dev/table/streaming/joins.zh.md
##########
@@ -189,50 +183,43 @@ val result = orders
</div>
</div>
-**Note**: State retention defined in a [query
configuration](query_configuration.html) is not yet implemented for temporal
joins.
-This means that the required state to compute the query result might grow
infinitely depending on the number of distinct primary keys for the history
table.
+**注意**: 时态 Join中的 State 保留(在 [查询配置](query_configuration.html)
中定义)还未实现。这意味着计算的查询结果所需的状态可能会无限增长,具体数量取决于历史记录表的不重复主键个数。
+
+### 基于 Processing-time 时态 Join
-### Processing-time Temporal Joins
+如果将 processing-time 作为时间属性,将无法将 _past_ 时间属性作为参数传递给时态表函数。
+根据定义,processing-time 总会是当前时间戳。因此,基于 processing-time
的时态表函数将始终返回基础表的最新已知版本,时态表函数的调用将始终返回基础表的最新已知版本,并且基础历史表中的任何更新也将立即覆盖当前值。
-With a processing-time time attribute, it is impossible to pass _past_ time
attributes as an argument to the temporal table function.
-By definition, it is always the current timestamp. Thus, invocations of a
processing-time temporal table function will always return the latest known
versions of the underlying table
-and any updates in the underlying history table will also immediately
overwrite the current values.
+只有最新版本的构建侧记录(是否最新由所定义的主键所决定)会被保存在 state 中。
+构建侧的更新不会对之前 Join 的结果产生影响。
-Only the latest versions (with respect to the defined primary key) of the
build side records are kept in the state.
-Updates of the build side will have no effect on previously emitted join
results.
+可以将 processing-time 的时态 Join 视作简单的哈希Map `HashMap <K,V>`,HashMap 中存储来自构建侧的所有记录。
+当来自构建侧的新插入的记录与旧值具有相同的 Key 时,旧值会被覆盖。
+探针侧的每条记录将总会根据 `HashMap` 的最新/当前状态来计算。
-One can think about a processing-time temporal join as a simple `HashMap<K,
V>` that stores all of the records from the build side.
-When a new record from the build side has the same key as some previous
record, the old value is just simply overwritten.
-Every record from the probe side is always evaluated against the most
recent/current state of the `HashMap`.
+### 基于 Event-time 时态 Join
-### Event-time Temporal Joins
+将 event-time 作为时间属性时,可将 _past_ 时间属性作为参数传递给时态表函数。这允许对两个表中在相同时间点的记录执行 Join 操作。
Review comment:
Would it be better to translate `_past_` here as something like `过去`?
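
To make the meaning of `_past_` in that sentence concrete, the contrast with the processing-time case in the same hunk can be sketched as follows. This is a minimal Python illustration of the `HashMap<K, V>` analogy from the original English text, assuming inner-join semantics. The function name `processing_time_join` and the event tuples are hypothetical and are not part of any Flink API.

```python
def processing_time_join(events):
    """Toy model of a processing-time temporal join (hypothetical helper).

    `events` is a list of ("rate", currency, rate) and ("order", currency, amount)
    tuples in arrival order. The build side is a plain dict keyed by currency.
    """
    latest_rates = {}  # build side: only the latest version per primary key
    results = []
    for kind, currency, value in events:
        if kind == "rate":
            # A new build-side record overwrites the old value for the same key.
            latest_rates[currency] = value
        elif currency in latest_rates:
            # A probe-side record is evaluated against the current state only;
            # there is no way to ask for a past version.
            results.append((currency, value, value * latest_rates[currency]))
    return results


events = [
    ("rate", "Euro", 114),
    ("order", "Euro", 2),
    ("rate", "Euro", 116),  # does not change the result already emitted
    ("order", "Euro", 2),
]
joined = processing_time_join(events)
```

Because the map holds only one value per key, a "past" time attribute has nothing to select from. That is the distinction the event-time sentence draws.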
##########
File path: docs/dev/table/streaming/joins.zh.md
##########
@@ -22,37 +22,38 @@ specific language governing permissions and limitations
under the License.
-->
-Joins are a common and well-understood operation in batch data processing to
connect the rows of two relations. However, the semantics of joins on [dynamic
tables](dynamic_tables.html) are much less obvious or even confusing.
+Join 在批数据处理中是比较常见且广为人知的运算,一般用于连接两张关系表。然而在[动态表](dynamic_tables.html)中 Join
的语义会难以理解甚至让人困惑。
-Because of that, there are a couple of ways to actually perform a join using
either Table API or SQL.
+因而,Flink 提供了几种基于 Table API 和 SQL 的 Join 方法。
-For more information regarding the syntax, please check the join sections in
[Table API](../tableApi.html#joins) and [SQL]({{ site.baseurl
}}/dev/table/sql/queries.html#joins).
+欲获取更多关于 Join 语法的细节,请参考 [Table API](../tableApi.html#joins) 和 [SQL]({{
site.baseurl }}/zh/dev/table/sql/queries.html#joins) 中的 Join 章节。
Review comment:
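For link targets, a recent [mailing list thread](http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Reminder-Prefer-link-tag-in-documentation-td42362.html)
recommends writing links in the form `{%link dev/table/sql/queries.zh.md %}#joins`, so please change this one accordingly.
Please update the other links in the same way.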