ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754576394
##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
~ under the License.
-->
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro
data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for
Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native
batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
-This Apache Druid extension enables Druid to ingest and understand the Apache
Avro data format. This extension provides
-two Avro Parsers for stream ingestion and Hadoop batch ingestion.
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser)
and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
is deprecated.
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro
OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
-Make sure to [include](../../development/extensions.md#loading-extensions)
`druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of
loaded extensions. See [Loading
extensions](../../development/extensions.md#loading-extensions).
-### Avro Types
+## Avro Types
-Druid supports most Avro types natively, there are however some exceptions
which are detailed here.
+Druid supports most Avro types natively. This section describes some
exceptions.
-#### Unions
+### Unions
Druid has two modes for supporting `union` types.
-The default mode will treat unions as a single value regardless of the type it
is populated with.
+The default mode treats unions as a single value regardless of the type of
data populating the union.
-If you wish to operate on each different member of a union however you can set
`extractUnionsByType` on the Avro parser in which case unions will be expanded
into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e
`int`, `string`
-* Complex named types are keyed by their names, this includes `record`,
`fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set
`extractUnionsByType` on the Avro parser. This configuration expands union
values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For
example: `int`, `string`
+- Complex named types are keyed by their names, this includes `record`,
`fixed` and `enum`.
+- The Avro null type is elided as its value can only ever be null
-This is safe because an Avro union can only contain a single member of each
unnamed type and duplicates of the same named type are not allowed.
-i.e only a single array is allowed, multiple records (or other named types)
are allowed as long as each has a unique name.
+This is safe because an Avro union can only contain a single member of each
unnamed type and duplicates of the same named type are not allowed. For
example: only a single array is allowed, multiple records (or other named
types) are allowed as long as each has a unique name.
-The members can then be accessed using a
[flattenSpec](../../ingestion/data-formats.md#flattenspec) similar other nested
types.
+You can then access the members of the union with a
[flattenSpec](../../ingestion/data-formats.md#flattenspec) like you would for
other nested types.
-#### Binary types
-`bytes` and `fixed` Avro types will be returned by default as base64 encoded
strings unless the `binaryAsString` option is enabled on the Avro parser.
-This setting will decode these types as UTF-8 strings.
+### Binary types
+The extension returns `bytes` and `fixed` Avro types as base64 encoded strings
by default. If you enable the `binaryAsString` option on the Avro parser, the
extension decodes these types as UTF-8 strings.
-#### Enums
-`enum` types will be returned as `string` of the enum symbol.
+### Enums
+The extension returns `enum` types as `string` of the enum symbol.
Review comment:
```suggestion
The extension returns `enum` types as `string` of the enum symbol.
```
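
For readers of this hunk, here is a minimal Python sketch of the two behaviours the new text describes: keying a union member by its type name, and rendering `bytes`/`fixed` values as base64 or UTF-8. It is not Druid code. The helper names `expand_union` and `render_binary` are hypothetical, and representing the elided null member as `None` is an assumption made for illustration only.

```python
import base64


def expand_union(type_name, value):
    """Illustrative only: key a union value by its Avro type name.

    Mirrors the documented rules: primitives and unnamed complex types
    are keyed by type name, named types by their own name.
    Assumption: the elided null member is represented here as None.
    """
    if type_name == "null" or value is None:
        return None
    return {type_name: value}


def render_binary(raw, binary_as_string=False):
    """Illustrative only: base64 by default, UTF-8 when the flag is set."""
    if binary_as_string:
        return raw.decode("utf-8")
    return base64.b64encode(raw).decode("ascii")


print(expand_union("int", 5))
print(expand_union("MyRecord", {"a": 1}))
print(render_binary(b"hi"))
print(render_binary(b"hi", binary_as_string=True))
```

Because a union cannot hold two members of the same unnamed type or two named types with the same name, the single key in each expanded object is unambiguous, which is the safety argument the hunk makes.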
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]