This is an automated email from the ASF dual-hosted git repository.
jark pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/master by this push:
new 80bea7a [FLINK-20163][docs-zh] Translate page "raw format" into Chinese
80bea7a is described below
commit 80bea7a567f8b3b6a9ff3e59bc968dbdd5891b04
Author: Flora Tao <[email protected]>
AuthorDate: Tue Nov 17 12:15:42 2020 +0800
[FLINK-20163][docs-zh] Translate page "raw format" into Chinese
This closes #14075
---
docs/dev/table/connectors/formats/raw.zh.md | 84 ++++++++++++++---------------
1 file changed, 42 insertions(+), 42 deletions(-)
diff --git a/docs/dev/table/connectors/formats/raw.zh.md b/docs/dev/table/connectors/formats/raw.zh.md
index 0d98d21..8280b6b 100644
--- a/docs/dev/table/connectors/formats/raw.zh.md
+++ b/docs/dev/table/connectors/formats/raw.zh.md
@@ -29,25 +29,25 @@ under the License.
* This will be replaced by the TOC
{:toc}
-The Raw format allows to read and write raw (byte based) values as a single column.
+Raw format 允许读写原始(基于字节)值作为单个列。
-Note: this format encodes `null` values as `null` of `byte[]` type. This may have limitation when used in `upsert-kafka`, because `upsert-kafka` treats `null` values as a tombstone message (DELETE on the key). Therefore, we recommend avoiding using `upsert-kafka` connector and the `raw` format as a `value.format` if the field can have a `null` value.
+注意: 这种格式将 `null` 值编码成 `byte[]` 类型的 `null`。这样在 `upsert-kafka` 中使用时可能会有限制,因为 `upsert-kafka` 将 `null` 值视为 墓碑消息(在键上删除)。因此,如果该字段可能具有 `null` 值,我们建议避免使用 `upsert-kafka` 连接器和 `raw` format 作为 `value.format`。
-Dependencies
+依赖
------------
-The Raw format is a built-in format, so you don't need to add additional dependency for projects and SQL Client.
+Raw format 是内置格式, 因此你无需为项目和 SQL Client 添加其他依赖。
-Example
+示例
----------------
-For example, you may have following raw log data in Kafka and want to read and analyse such data using Flink SQL.
+例如,你可能在 Kafka 中具有原始日志数据,并希望使用 Flink SQL 读取和分析此类数据。
```
47.29.201.179 - - [28/Feb/2019:13:17:10 +0000] "GET /?p=1 HTTP/2.0" 200 5316 "https://domain.com/?p=1" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36" "2.75"
```
-The following creates a table where it reads from (and can writes to) the underlying Kafka topic as an anonymous string value in UTF-8 encoding by using `raw` format:
+下面的代码创建了一张表,使用 `raw` format 以 UTF-8 编码的形式从中读取(也可以写入)底层的 Kafka topic 作为匿名字符串值:
<div class="codetabs" markdown="1">
<div data-lang="SQL" markdown="1">
@@ -65,7 +65,7 @@ CREATE TABLE nginx_log (
</div>
</div>
-Then you can read out the raw data as a pure string, and split it into multiple fields using an user-defined-function for further analysing, e.g. `my_split` in the example.
+然后,你可以将原始数据读取为纯字符串,之后使用用户自定义函数将其分为多个字段进行进一步分析。例如 示例中的 `my_split`。
<div class="codetabs" markdown="1">
<div data-lang="SQL" markdown="1">
@@ -78,105 +78,105 @@ FROM(
</div>
</div>
-In contrast, you can also write a single column of STRING type into this Kafka topic as an anonymous string value in UTF-8 encoding.
+相对应的,你也可以将一个 STRING 类型的列以 UTF-8 编码的匿名字符串值写入 Kafka topic。
-Format Options
+Format 参数
----------------
<table class="table table-bordered">
<thead>
<tr>
- <th class="text-left" style="width: 25%">Option</th>
- <th class="text-center" style="width: 8%">Required</th>
- <th class="text-center" style="width: 7%">Default</th>
- <th class="text-center" style="width: 10%">Type</th>
- <th class="text-center" style="width: 50%">Description</th>
+ <th class="text-left" style="width: 25%">参数</th>
+ <th class="text-center" style="width: 8%">是否必选</th>
+ <th class="text-center" style="width: 7%">默认值</th>
+ <th class="text-center" style="width: 10%">类型</th>
+ <th class="text-center" style="width: 50%">描述</th>
</tr>
</thead>
<tbody>
<tr>
<td><h5>format</h5></td>
- <td>required</td>
+ <td>必选</td>
<td style="word-wrap: break-word;">(none)</td>
<td>String</td>
- <td>Specify what format to use, here should be 'raw'.</td>
+ <td>指定要使用的格式, 这里应该是 'raw'。</td>
</tr>
<tr>
<td><h5>raw.charset</h5></td>
- <td>optional</td>
+ <td>可选</td>
<td style="word-wrap: break-word;">UTF-8</td>
<td>String</td>
- <td>Specify the charset to encode the text string.</td>
+ <td>指定字符集来编码文本字符串。</td>
</tr>
<tr>
<td><h5>raw.endianness</h5></td>
- <td>optional</td>
+ <td>可选</td>
<td style="word-wrap: break-word;">big-endian</td>
<td>String</td>
- <td>Specify the endianness to encode the bytes of numeric value. Valid values are 'big-endian' and 'little-endian'.
- See more details of <a href="https://en.wikipedia.org/wiki/Endianness">endianness</a>.</td>
+ <td>指定字节序来编码数字值的字节。有效值为'big-endian'和'little-endian'。
+ 更多细节可查阅 <a href="https://zh.wikipedia.org/wiki/字节序">字节序</a>。</td>
</tr>
</tbody>
</table>
-Data Type Mapping
+数据类型映射
----------------
-The table below details the SQL types the format supports, including details of the serializer and deserializer class for encoding and decoding.
+下表详细说明了这种格式支持的 SQL 类型,包括用于编码和解码的序列化类和反序列化类的详细信息。
<table class="table table-bordered">
<thead>
<tr>
- <th class="text-left">Flink SQL type</th>
- <th class="text-left">Value</th>
+ <th class="text-left">Flink SQL 类型</th>
+ <th class="text-left">值</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>CHAR / VARCHAR / STRING</code></td>
- <td>A UTF-8 (by default) encoded text string.<br>
- The encoding charset can be configured by 'raw.charset'.</td>
+ <td>UTF-8(默认)编码的文本字符串。<br>
+ 编码字符集可以通过 'raw.charset' 进行配置。</td>
</tr>
<tr>
<td><code>BINARY / VARBINARY / BYTES</code></td>
- <td>The sequence of bytes itself.</td>
+ <td>字节序列本身。</td>
</tr>
<tr>
<td><code>BOOLEAN</code></td>
- <td>A single byte to indicate boolean value, 0 means false, 1 means true.</td>
+ <td>表示布尔值的单个字节,0表示 false, 1 表示 true。</td>
</tr>
<tr>
<td><code>TINYINT</code></td>
- <td>A single byte of the singed number value.</td>
+ <td>有符号数字值的单个字节。</td>
</tr>
<tr>
<td><code>SMALLINT</code></td>
- <td>Two bytes with big-endian (by default) encoding.<br>
- The endianness can be configured by 'raw.endianness'.</td>
+ <td>采用big-endian(默认)编码的两个字节。<br>
+ 字节序可以通过 'raw.endianness' 配置。</td>
</tr>
<tr>
<td><code>INT</code></td>
- <td>Four bytes with big-endian (by default) encoding.<br>
- The endianness can be configured by 'raw.endianness'.</td>
+ <td>采用 big-endian (默认)编码的四个字节。<br>
+ 字节序可以通过 'raw.endianness' 配置。</td>
</tr>
<tr>
<td><code>BIGINT</code></td>
- <td>Eight bytes with big-endian (by default) encoding.<br>
- The endianness can be configured by 'raw.endianness'.</td>
+ <td>采用 big-endian (默认)编码的八个字节。<br>
+ 字节序可以通过 'raw.endianness' 配置。</td>
</tr>
<tr>
<td><code>FLOAT</code></td>
- <td>Four bytes with IEEE 754 format and big-endian (by default) encoding.<br>
- The endianness can be configured by 'raw.endianness'.</td>
+ <td>采用 IEEE 754 格式和 big-endian (默认)编码的四个字节。<br>
+ 字节序可以通过 'raw.endianness' 配置。</td>
</tr>
<tr>
<td><code>DOUBLE</code></td>
- <td>Eight bytes with IEEE 754 format and big-endian (by default) encoding.<br>
- The endianness can be configured by 'raw.endianness'.</td>
+ <td>采用 IEEE 754 格式和 big-endian (默认)编码的八个字节。<br>
+ 字节序可以通过 'raw.endianness' 配置。</td>
</tr>
<tr>
<td><code>RAW</code></td>
- <td>The sequence of bytes serialized by the underlying TypeSerializer of the RAW type.</td>
+ <td>通过 RAW 类型的底层 TypeSerializer 序列化的字节序列。</td>
</tr>
</tbody>
</table>
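
The byte layouts listed in the data type mapping table above can be illustrated with a small sketch. It is written in Python rather than Flink's Java and is not Flink's implementation: `encode_raw` and its parameters are hypothetical names chosen for this example, with `charset` and `endianness` standing in for the `raw.charset` and `raw.endianness` options. The `RAW` type is left out because its bytes depend on the underlying TypeSerializer.

```python
import struct

# struct format codes for the fixed-width numeric types in the mapping table.
# With an explicit byte-order prefix, struct uses standard sizes:
# b=1, h=2, i=4, q=8 bytes; f and d are IEEE 754 single and double precision.
_NUMERIC_FORMATS = {
    "TINYINT": "b",
    "SMALLINT": "h",
    "INT": "i",
    "BIGINT": "q",
    "FLOAT": "f",
    "DOUBLE": "d",
}


def encode_raw(value, sql_type, charset="UTF-8", endianness="big-endian"):
    """Hypothetical helper: encode one value as the mapping table describes."""
    if endianness not in ("big-endian", "little-endian"):
        raise ValueError("endianness must be 'big-endian' or 'little-endian'")
    if sql_type in ("CHAR", "VARCHAR", "STRING"):
        # Text is encoded with the configured charset (UTF-8 by default).
        return value.encode(charset)
    if sql_type in ("BINARY", "VARBINARY", "BYTES"):
        # The sequence of bytes itself.
        return bytes(value)
    if sql_type == "BOOLEAN":
        # A single byte: 0 means false, 1 means true.
        return b"\x01" if value else b"\x00"
    if sql_type in _NUMERIC_FORMATS:
        order = ">" if endianness == "big-endian" else "<"
        return struct.pack(order + _NUMERIC_FORMATS[sql_type], value)
    raise ValueError("unsupported type in this sketch: " + sql_type)


print(encode_raw(1, "INT").hex())                              # 00000001
print(encode_raw(1, "INT", endianness="little-endian").hex())  # 01000000
```

Swapping only the byte-order prefix keeps the width of each type fixed, which matches the table: the endianness option changes the order of the bytes, not their count.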