ccaominh commented on a change in pull request #8903: S3 input source
URL: https://github.com/apache/incubator-druid/pull/8903#discussion_r349759328
 
 

 ##########
 File path: docs/development/extensions-core/s3.md
 ##########
 @@ -98,6 +98,54 @@ You can enable [server-side 
encryption](https://docs.aws.amazon.com/AmazonS3/lat
 - kms: [Server-side encryption with AWS KMS–Managed 
Keys](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html)
 - custom: [Server-side encryption with Customer-Provided Encryption 
Keys](https://docs.aws.amazon.com/AmazonS3/latest/dev/ServerSideEncryptionCustomerKeys.html)
 
+
+<a name="input-source"></a>
+
+## S3 batch ingestion input source
+
+This extension also provides an input source for Druid native batch ingestion 
to support reading objects directly from S3. Objects can be specified either 
as a list of S3 URI strings or as a list of S3 location prefixes; for each 
prefix, the input source lists the contents of that location and ingests all 
objects it contains. The S3 input source is splittable and can be used by [native parallel index 
tasks](../../ingestion/native-batch.md#parallel-task), where each worker task 
of `index_parallel` will read a single object.
+
+Sample spec:
+
+```json
+...
+    "ioConfig": {
+      "type": "index_parallel",
+      "inputSource": {
+        "type": "s3",
+        "uris": ["s3://foo/bar/file.json", "s3://bar/foo/file2.json"]
+      },
+      "inputFormat": {
+        "type": "json"
+      },
+      ...
+    },
+...
+```
+
+```json
+...
+    "ioConfig": {
+      "type": "index_parallel",
+      "inputSource": {
+        "type": "s3",
+        "prefixes": ["s3://foo/bar", "s3://bar/foo"]
+      },
+      "inputFormat": {
+        "type": "json"
+      },
+      ...
+    },
+...
+```
+
+|property|description|default|required?|
+|--------|-----------|-------|---------|
+|type|This should be `s3`.|N/A|yes|
+|uris|JSON array of URIs where S3 files to be ingested are located.|N/A|`uris` 
or `prefixes` must be set|
+|prefixes|JSON array of URI prefixes for the locations of S3 files to be 
ingested.|N/A|`uris` or `prefixes` must be set|
+
 
 Review comment:
   With your latest changes, you need to add another row for `objects` here and 
update the `required?` value for the other rows based on the presence of 
`objects`.
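
   One possible shape for the updated table, as a sketch only. The description 
text for `objects` and the exact wording of the `required?` cells are 
assumptions for illustration, not taken from the PR:

```markdown
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be `s3`.|N/A|yes|
|uris|JSON array of URIs where S3 files to be ingested are located.|N/A|`uris` or `prefixes` or `objects` must be set|
|prefixes|JSON array of URI prefixes for the locations of S3 files to be ingested.|N/A|`uris` or `prefixes` or `objects` must be set|
|objects|JSON array of S3 objects to be ingested (hypothetical wording).|N/A|`uris` or `prefixes` or `objects` must be set|
```

   A sample spec using `objects` alongside the two existing ones would 
probably help too.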
