techdocsmith commented on a change in pull request #11153:
URL: https://github.com/apache/druid/pull/11153#discussion_r619340293
##########
File path: docs/ingestion/native-batch.md
##########
@@ -1004,10 +1004,8 @@ Google Cloud Storage object:
> You need to include the
> [`druid-azure-extensions`](../development/extensions-core/azure.md) as an
> extension to use the Azure input source.
-The Azure input source is to support reading objects directly from Azure Blob store. Objects can be
-specified as list of Azure Blob store URI strings. The Azure input source is splittable and can be used
-by the [Parallel task](#parallel-task), where each worker task of `index_parallel` will read
-a single object.
+The Azure input source is used to read objects directly from Azure Blob store or Azure Data Lake sources. Objects can be
Review comment:
```suggestion
The Azure input source reads objects directly from Azure Blob store or Azure Data Lake sources. You can
```
nit
##########
File path: docs/ingestion/native-batch.md
##########
@@ -1004,10 +1004,8 @@ Google Cloud Storage object:
> You need to include the
> [`druid-azure-extensions`](../development/extensions-core/azure.md) as an
> extension to use the Azure input source.
-The Azure input source is to support reading objects directly from Azure Blob store. Objects can be
-specified as list of Azure Blob store URI strings. The Azure input source is splittable and can be used
-by the [Parallel task](#parallel-task), where each worker task of `index_parallel` will read
-a single object.
+The Azure input source is used to read objects directly from Azure Blob store or Azure Data Lake sources. Objects can be
+specified as a list of file URI strings or prefixes. The Azure input source is splittable and can be used by the [Parallel task](#parallel-task), where each worker task reads a single object.
Review comment:
```suggestion
specify objects as a list of file URI strings or prefixes. You can split the Azure input source for use with [Parallel task](#parallel-task) indexing and each worker task reads one chunk of the split data.
```
I think we should differentiate between the `single object` and the sections of the split-out object, since we're using `object` for the whole.
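For context on the prose being revised, a minimal sketch of how the Azure input source might appear inside an `index_parallel` ingestion spec (the `type`, `uris`, and `prefixes` fields are per the Azure input source docs; the container and path values here are hypothetical):

```json
{
  "type": "index_parallel",
  "ioConfig": {
    "type": "index_parallel",
    "inputSource": {
      "type": "azure",
      "uris": ["azure://my-container/path/to/file.json"],
      "prefixes": ["azure://my-container/path/to/"]
    },
    "inputFormat": {
      "type": "json"
    }
  }
}
```

In practice you would supply either `uris` or `prefixes` (or `objects`), not both; each split produced from the list is handed to one worker task.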
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]