ivankelly commented on a change in pull request #2152: GCS offload support(4):
add documentations for GCS
URL: https://github.com/apache/incubator-pulsar/pull/2152#discussion_r205039932
##########
File path: site/docs/latest/cookbooks/tiered-storage.md
##########
@@ -17,42 +19,63 @@ A topic in Pulsar is backed by a log, known as a managed
ledger. This log is com
The Tiered Storage offloading mechanism takes advantage of this segment
oriented architecture. When offloading is requested, the segments of the log
are copied, one-by-one, to tiered storage. All segments of the log, apart from
the segment currently being written to can be offloaded.
-## Amazon S3
+On the broker, the administrator must configure the bucket or credentials for
the cloud storage service. The configured bucket must exist before attempting
to offload. If it does not exist, the offload operation will fail.
-Tiered storage currently supports S3 for long term storage. On the broker, the
administrator must configure a S3 bucket and the AWS region where the bucket
exists. Offloaded data will be placed into this bucket.
+Pulsar uses multi-part objects to upload the segment data. It is possible
that a broker could crash while uploading the data. We recommend you add a
lifecycle rule to your bucket to expire incomplete multi-part uploads after a
day or two to avoid getting charged for incomplete uploads.
-The configured S3 bucket must exist before attempting to offload. If it does
not exist, the offload operation will fail.
+## Configuring for S3 and GCS in the broker
-Pulsar users multipart objects to update the segment data. It is possible that
a broker could crash while uploading the data. We recommend you add a lifecycle
rule your S3 bucket to expire incomplete multipart upload after a day or two to
avoid getting charged for incomplete uploads.
+Offloading is configured in ```broker.conf```.
-### Configuring the broker
+At a minimum, the administrator must configure the driver, the bucket and the
authentication. There are also other knobs to configure, such as the bucket
region and the maximum block size in the backing storage.
-Offloading is configured in ```broker.conf```.
+### Configure the driver
-At a minimum, the user must configure the driver, the region and the bucket.
+Currently the supported driver types are "S3", "aws-s3" and
"google-cloud-storage".
+{% include admonition.html type="warning" content="Driver names are
case-insensitive. `s3` and `aws-s3` are similar; with `aws-s3` you don't need
to define the URL of the endpoint because the driver knows to use
`s3.amazonaws.com`." %}
```conf
managedLedgerOffloadDriver=S3
-s3ManagedLedgerOffloadRegion=eu-west-3
+```
+
+### Configure the Bucket
+
+On the broker, the administrator must configure the bucket or credentials for
the cloud storage service. The configured bucket must exist before attempting
to offload. If it does not exist, the offload operation will fail.
+
+- For driver type "S3" or "aws-s3", the administrator should configure
`s3ManagedLedgerOffloadBucket`.
+
+```conf
s3ManagedLedgerOffloadBucket=pulsar-topic-offload
```
-It is also possible to specify the s3 endpoint directly, using
```s3ManagedLedgerOffloadServiceEndpoint```. This is useful if you are using a
non-AWS storage service which provides an S3 compatible API.
+- For driver type "google-cloud-storage", the administrator should
configure `gcsManagedLedgerOffloadBucket`.
+```conf
+gcsManagedLedgerOffloadBucket=pulsar-topic-offload
+```
-{% include admonition.html type="warning" content="If the endpoint is
specified directly, then the region must _not_ be set." %}
+### Configure the Bucket Region
-{% include admonition.html type="warning" content="The broker.conf of all
brokers must have the same configuration for driver, region and bucket for
offload to avoid data becoming unavailable as topics move from one broker to
another." %}
+The bucket region is the region where the bucket is located.
-Pulsar also provides some knobs to configure the size of requests sent to S3.
+For AWS S3, the default region is `US East (N. Virginia)`. The [AWS
Regions and
Endpoints](https://docs.aws.amazon.com/general/latest/gr/rande.html) page
contains more information.
Review comment:
The user doesn't care if there is duplication. They care how much they need
to read to get up and running. And how much of what they read is useful to
them. With the current layout, 50% of what they read is useless to them.
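To illustrate the layout this comment argues for, each provider could get its own self-contained section, so a reader only sees the settings that apply to them. The fragments below are a rough sketch, not the final documentation: they use only keys and values that appear in the diff above, and the region line is taken from the removed S3 example, so whether it is still required is an assumption.

For S3:

```conf
# Sketch of a self-contained S3 section for broker.conf
managedLedgerOffloadDriver=aws-s3
s3ManagedLedgerOffloadBucket=pulsar-topic-offload
# Region key as shown in the earlier S3-only docs (assumed still valid)
s3ManagedLedgerOffloadRegion=eu-west-3
```

For GCS:

```conf
# Sketch of a self-contained GCS section for broker.conf
managedLedgerOffloadDriver=google-cloud-storage
gcsManagedLedgerOffloadBucket=pulsar-topic-offload
```

Some text would be repeated across the two sections, but a reader setting up one provider would not need to read past the settings for the other.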