klion26 commented on a change in pull request #13486:
URL: https://github.com/apache/flink/pull/13486#discussion_r580814224
##########
File path: docs/dev/connectors/streamfile_sink.zh.md
##########
@@ -289,20 +289,18 @@ stream.addSink(StreamingFileSink.forBulkFormat(
</div>
</div>
-#### ORC Format
+#### ORC 格式
-To enable the data to be bulk encoded in ORC format, Flink offers [OrcBulkWriterFactory]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/formats/orc/writers/OrcBulkWriterFactory.html)
-which takes a concrete implementation of [Vectorizer]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/orc/vector/Vectorizer.html).
+为了数据能够批量编码成 ORC 格式,Flink 提供具体实现了 [Vectorizer]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/orc/vector/Vectorizer.html)
+的 [OrcBulkWriterFactory]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/formats/orc/writers/OrcBulkWriterFactory.html) 实例。
-Like any other columnar format that encodes data in bulk fashion, Flink's `OrcBulkWriter` writes the input elements in batches. It uses
-ORC's `VectorizedRowBatch` to achieve this.
+与其它批量编码列式存储格式一样,`OrcBulkWriter` 使用 ORC `VectorizedRowBatch` 实现输入数据批量写入功能。
-Since the input element has to be transformed to a `VectorizedRowBatch`, users have to extend the abstract `Vectorizer`
-class and override the `vectorize(T element, VectorizedRowBatch batch)` method. As you can see, the method provides an
-instance of `VectorizedRowBatch` to be used directly by the users so users just have to write the logic to transform the
-input `element` to `ColumnVectors` and set them in the provided `VectorizedRowBatch` instance.
+因为输入的数据需要转换成 `VectorizedRowBatch`,用户需要继承抽象类 `Vectorizer`
+同时实现 `vectorize(T element, VectorizedRowBatch batch)` 方法。正如你所看到的,这个方法提供用户一个 `VectorizedRowBatch` 实例,
+用户需要将输入的 `element` 转化成 `ColumnVectors`,同时设置在 `VectorizedRowBatch` 实例中。
-For example, if the input element is of type `Person` which looks like:
+我们看下输入数据类型为 `Person` 的一个例子,类型定义如下:
Review comment:
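This sentence reads a bit colloquial. Could you tighten it, for example to something like "以类型 `Person` 为例:" ("Taking the type `Person` as an example:")?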
##########
File path: docs/dev/connectors/streamfile_sink.zh.md
##########
@@ -289,20 +289,18 @@ stream.addSink(StreamingFileSink.forBulkFormat(
</div>
</div>
-#### ORC Format
+#### ORC 格式
-To enable the data to be bulk encoded in ORC format, Flink offers [OrcBulkWriterFactory]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/formats/orc/writers/OrcBulkWriterFactory.html)
-which takes a concrete implementation of [Vectorizer]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/orc/vector/Vectorizer.html).
+为了数据能够批量编码成 ORC 格式,Flink 提供具体实现了 [Vectorizer]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/orc/vector/Vectorizer.html)
Review comment:
1. The sentence beginning "Flink 提供具体xxx" does not read smoothly. Could you rephrase it?