This is an automated email from the ASF dual-hosted git repository.

rkk pushed a commit to branch SDAP-518-docs
in repository https://gitbox.apache.org/repos/asf/sdap-nexus.git

commit 77cf856dbcd3089ba7056b89a8c1407fc8fd19f0
Author: rileykk <[email protected]>
AuthorDate: Thu May 30 15:10:37 2024 -0700

    Initial work on CC docs
---
 docs/collections.rst | 123 +++++++++++++++++++++++++++++++++++++++++++++++++++
 docs/index.rst       |   6 +--
 2 files changed, 124 insertions(+), 5 deletions(-)

diff --git a/docs/collections.rst b/docs/collections.rst
new file mode 100644
index 0000000..c39f676
--- /dev/null
+++ b/docs/collections.rst
@@ -0,0 +1,123 @@
+.. _collections:
+
+*******
+Collection Config Guide
+*******
+
+Introduction
+============
+
+The Collection Config is a configuration file that defines collections to be ingested and maintained in SDAP. Currently,
+it supports defining collections of NetCDF data that will be processed into the custom NEXUS protobuf tile format or gridded
+Zarr data which can be used by SDAP directly with no need for processing. SDAP Ingester currently supports source data stored
+in AWS S3 or on the local filesystem (currently, however, not both at the same time).
+
+This guide will explain how to set up both protobuf and Zarr collections.
+
+.. _collections-basics:
+
+Basic Structure
+==========
+
+The Collection Config is a YAML file containing a single list named ``collections``:
+
+.. code-block::
+
+  collections: []
+
+The items in this list are the collections defined and they have the basic structure:
+
+.. code-block::
+
+  - id: <single variable collection name>
+    path: <root collection location. Local path or S3 URI>
+    priority: <queue priority>
+    projection: <Grid | Swath>
+    dimensionNames:
+      latitude: <name of the latitude coordinate in the data>
+      longitude: <name of the longitude coordinate in the data>
+      time: <name of the time coordinate in the data>
+      variable: <variable name>
+  - id: <multi variable collection name>
+    path: <root collection location. Local path or S3 URI>
+    priority: <queue priority>
+    projection: <GridMulti | SwathMulti>
+    dimensionNames:
+      latitude: <name of the latitude coordinate in the data>
+      longitude: <name of the longitude coordinate in the data>
+      time: <name of the time coordinate in the data>
+      variables:
+        - <variable name 1>
+        - <variable name 2>
+        - <variable name 3>
+
+There are slight variations and additions to this structure depending on the type of collection, which will be covered below.
+
+.. _collections-nc:
+
+NetCDF - Protobuf Collections
+=======
+
+TBA
+
+.. _collections-zarr:
+
+Zarr Collections
+====
+
+To specify a collection as a Zarr collection, simply add ``storeType: zarr`` to the collection object. If the data is local,
+this is all you need to do.
+
+.. code-block::
+
+  id: <collection name>
+  path: <root collection location. Local path>
+  priority: <queue priority>
+  projection: <Grid | GridMulti>
+  storeType: zarr
+  dimensionNames:
+    latitude: <name of the latitude coordinate in the data>
+    longitude: <name of the longitude coordinate in the data>
+    time: <name of the time coordinate in the data>
+    variable: <variable name>
+
+For data in S3, you need to provide information on how to access the data. This is currently done with the ``config.aws`` object.
+
+You will need to provide credentials to access the bucket, or specify if it is public:
+
+Example:
+
+.. code-block::
+
+  collections:
+    - id: MUR_SST
+      path: s3://mur-sst/zarr-v1/
+      priority: 1
+      projection: Grid
+      storeType: zarr
+      dimensionNames:
+        latitude: lat
+        longitude: lon
+        time: time
+        variable: analysed_sst
+      config:
+        aws:
+          public: true
+    - id: private_data
+      path: s3://example-bucket/zarr/path/
+      priority: 1
+      projection: GridMulti
+      storeType: zarr
+      dimensionNames:
+        latitude: lat
+        longitude: lon
+        time: time
+        variables:
+          - var1
+          - var2
+          - var3
+      config:
+        aws:
+          accessKeyID: <secret>
+          secretAccessKey: <secret>
+          public: false
diff --git a/docs/index.rst b/docs/index.rst
index 649d0e7..b781f6b 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,10 +1,6 @@
-Welcome to incubator-sdap-nexus's documentation!
+Welcome to the Apache SDAP project documentation!
 ================================================
 
-.. warning::
-
-   Apache incubator-sdap-nexus is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the name of Apache TLP sponsor. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does [...]
-
 .. toctree::
    :maxdepth: 2
    :caption: Contents:
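
A minimal sketch of how the collection structure documented in docs/collections.rst could be sanity-checked before a config is handed to the ingester. This is not SDAP code: `check_collection` and its messages are hypothetical, and the sketch assumes that `variable`/`variables` are nested under `dimensionNames` and that `config.aws` sits at the collection level, as the examples in the diff suggest. Entries are plain dicts, as a YAML loader would produce them.

```python
# Hypothetical helper for illustration only; not part of SDAP or this commit.
REQUIRED_KEYS = ("id", "path", "priority", "projection", "dimensionNames")
SINGLE_VAR = {"Grid", "Swath"}
MULTI_VAR = {"GridMulti", "SwathMulti"}


def check_collection(entry):
    """Return a list of problems found in one collection entry (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in entry]

    # Assumption: coordinate and variable names all live under dimensionNames.
    dims = entry.get("dimensionNames") or {}
    for coord in ("latitude", "longitude", "time"):
        if coord not in dims:
            problems.append(f"dimensionNames missing: {coord}")

    projection = entry.get("projection")
    if projection in SINGLE_VAR:
        if "variable" not in dims:
            problems.append("single-variable projection needs dimensionNames.variable")
    elif projection in MULTI_VAR:
        if "variables" not in dims:
            problems.append("multi-variable projection needs dimensionNames.variables")
    elif projection is not None:
        problems.append(f"unknown projection: {projection}")

    # Zarr data in S3 needs either public access or credentials in config.aws.
    is_s3 = str(entry.get("path", "")).startswith("s3://")
    if entry.get("storeType") == "zarr" and is_s3:
        aws = (entry.get("config") or {}).get("aws") or {}
        has_keys = bool(aws.get("accessKeyID")) and bool(aws.get("secretAccessKey"))
        if not aws.get("public") and not has_keys:
            problems.append("S3 Zarr collection needs config.aws credentials or public: true")

    return problems


if __name__ == "__main__":
    # The public MUR_SST entry from the example in the diff.
    mur_sst = {
        "id": "MUR_SST",
        "path": "s3://mur-sst/zarr-v1/",
        "priority": 1,
        "projection": "Grid",
        "storeType": "zarr",
        "dimensionNames": {
            "latitude": "lat",
            "longitude": "lon",
            "time": "time",
            "variable": "analysed_sst",
        },
        "config": {"aws": {"public": True}},
    }
    print(check_collection(mur_sst))
```

Returning a list of messages instead of raising on the first problem lets a reviewer see every issue in an entry at once.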
