JingsongLi commented on a change in pull request #12321:
URL: https://github.com/apache/flink/pull/12321#discussion_r436469905
##########
File path: docs/dev/connectors/streamfile_sink.md
##########
@@ -204,6 +205,109 @@ input.addSink(sink)
</div>
</div>
+#### Avro format
+
+Flink also provides built-in support for writing data into Avro files. A list of convenience methods to create
+Avro writer factories and their associated documentation can be found in the
+[AvroWriters]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/formats/avro/AvroWriters.html) class.
+
+To use the Avro writers in your application you need to add the following dependency:
+
+{% highlight xml %}
+<dependency>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>flink-avro</artifactId>
+ <version>{{ site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+A StreamingFileSink that writes data to Avro files can be created like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
+import org.apache.flink.formats.avro.AvroWriters;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericRecord;
+
+Schema schema = ...;
+DataStream<GenericRecord> input = ...;
+
+final StreamingFileSink<GenericRecord> sink = StreamingFileSink
+ .forBulkFormat(outputBasePath, AvroWriters.forGenericRecord(schema))
+ .build();
+
+input.addSink(sink);
+
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink
+import org.apache.flink.formats.avro.AvroWriters
+import org.apache.avro.Schema
+import org.apache.avro.generic.GenericRecord
+
+val schema: Schema = ...
+val input: DataStream[GenericRecord] = ...
+
+val sink: StreamingFileSink[GenericRecord] = StreamingFileSink
+ .forBulkFormat(outputBasePath, AvroWriters.forGenericRecord(schema))
+ .build()
+
+input.addSink(sink)
+
+{% endhighlight %}
+</div>
+</div>
+
+For creating customized Avro writers, e.g. enabling compression, users need to create the `AvroWriterFactory`
+with a custom implementation of the [AvroBuilder]({{ site.javadocs_baseurl }}/api/java/org/apache/flink/formats/avro/AvroBuilder.html) interface:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+AvroWriterFactory<?> factory = new AvroWriterFactory<>(new AvroBuilder<Address>() {
+ @Override
+ public DataFileWriter<Address> createWriter(OutputStream out) throws IOException {
Review comment:
Better to use a lambda here, because a lambda is serializable but the
anonymous inner class is not.
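
To illustrate the concern, here is a minimal, self-contained sketch. It does not use any Flink or Avro classes: `WriterBuilder` is a hypothetical stand-in for a serializable functional interface such as `AvroBuilder`, and `Job`, `codec` and `isSerializable` are names invented for the example. The anonymous class reads a field of its enclosing instance, so it keeps a reference to that (non-serializable) instance; the lambda captures only a local `String`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class LambdaVsInnerClass {

    // Hypothetical stand-in for a serializable builder interface (not Flink's API).
    interface WriterBuilder extends Serializable {
        String createWriter(OutputStream out) throws IOException;
    }

    // Stand-in for user code that is itself NOT Serializable.
    static class Job {
        private final String codec = "snappy";

        WriterBuilder anonymousBuilder() {
            // Reads this.codec, so the anonymous class holds a reference to Job.this.
            return new WriterBuilder() {
                @Override
                public String createWriter(OutputStream out) {
                    return "writer:" + codec;
                }
            };
        }

        WriterBuilder lambdaBuilder() {
            // Copy the field to a local first, so the lambda captures only a String.
            final String localCodec = this.codec;
            return out -> "writer:" + localCodec;
        }
    }

    // Tries Java serialization and reports whether it succeeded.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        System.out.println("anonymous class serializable: " + isSerializable(job.anonymousBuilder()));
        System.out.println("lambda serializable: " + isSerializable(job.lambdaBuilder()));
    }
}
```

Applied to the docs snippet, this would presumably mean passing a lambda of the form `out -> { ... }` to `new AvroWriterFactory<>(...)` instead of `new AvroBuilder<Address>() { ... }`, assuming `AvroBuilder` has a single abstract method as the quoted code suggests.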