This is an automated email from the ASF dual-hosted git repository.

eamonford pushed a commit to branch update-docs
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-ingester.git
commit ca3898e14aab479c5578b5f8e2a1a72183436c60
Author: Eamon Ford <[email protected]>
AuthorDate: Mon Dec 7 13:58:59 2020 -0800

    Update Collection Manager readme
---
 collection_manager/README.md | 80 ++++++++++++++++++++++++++++++++------------
 1 file changed, 59 insertions(+), 21 deletions(-)

diff --git a/collection_manager/README.md b/collection_manager/README.md
index 84df468..bc630cd 100644
--- a/collection_manager/README.md
+++ b/collection_manager/README.md
@@ -26,7 +26,7 @@ From `incubator-sdap-ingester`, run:
 
 A path to a collections configuration file must be passed in to the Collection Manager
 at startup via the `--collections-path` parameter. Below is an example of what the
-collections configuration file should look like:
+collections configuration file could look like:
 
 ```yaml
 # collections.yaml
@@ -34,35 +34,73 @@ collections configuration file should look like:
 collections:
 
     # The identifier for the dataset as it will appear in NEXUS.
-  - id: TELLUS_GRACE_MASCON_CRI_GRID_RL05_V2_LAND
+  - id: "CSR-RL06-Mascons_LAND"
 
-    # The local path to watch for NetCDF granule files to be associated with this dataset.
-    # Supports glob-style patterns.
-    path: /opt/data/grace/*land*.nc
-
-    # The name of the NetCDF variable to read when ingesting granules into NEXUS for this dataset.
-    variable: lwe_thickness
+    # The path to watch for NetCDF granule files to be associated with this dataset.
+    # This can also be an S3 path prefix, for example "s3://my-bucket/path/to/granules/"
+    path: "/data/CSR-RL06-Mascons-land/"
 
     # An integer priority level to use when publishing messages to RabbitMQ for historical data.
-    # Higher number = higher priority.
-    priority: 1
+    # Higher number = higher priority. Scale is 1-10.
+    priority: 1
 
     # An integer priority level to use when publishing messages to RabbitMQ for forward-processing data.
-    # Higher number = higher priority.
+    # Higher number = higher priority. Scale is 1-10.
     forward-processing-priority: 5
 
-  - id: TELLUS_GRACE_MASCON_CRI_GRID_RL05_V2_OCEAN
-    path: /opt/data/grace/*ocean*.nc
-    variable: lwe_thickness
-    priority: 2
-    forward-processing-priority: 6
+    # The type of projection to use when processing granules in this collection.
+    # Accepted values are Grid, ECCO, TimeSeries, or Swath.
+    projection: Grid
+
+    dimensionNames:
+      # The name of the primary variable
+      variable: lwe_thickness
+
+      # The name of the latitude variable
+      latitude: lat
+
+      # The name of the longitude variable
+      longitude: lon
+
+      # The name of the depth variable (only include if depth variable exists)
+      depth: Z
+
+      # The name of the time variable (only include if time variable exists)
+      time: Time
+
+    # This section is an index of each dimension on which the primary variable is dependent, mapped to their desired slice sizes.
+    slices:
+      Z: 1
+      Time: 1
+      lat: 60
+      lon: 60
+
+  - id: ocean-bottom-pressure
+    path: /data/OBP/
+    priority: 6
+    forward-processing-priority: 7
+    projection: ECCO
+    dimensionNames:
+      latitude: YC
+      longitude: XC
+      time: time
+      # "tile" is required when using the ECCO projection. This refers to the name of the dimension containing the ECCO tile index.
+      tile: tile
+      variable: OBP
+    slices:
+      time: 1
+      tile: 1
+      i: 30
+      j: 30
+```
 
-  - id: AVHRR_OI-NCEI-L4-GLOB-v2.0
-    path: /opt/data/avhrr/*.nc
-    variable: analysed_sst
-    priority: 1
+Note that the dimensions listed under `slices` will not necessarily match those under `dimensionNames`. This is because sometimes
+the actual dimensions are referenced by index variables.
+> **Tip:** An easy way to determine which variables go under `dimensionNames` and which ones go under `slices` is that the variables
+> on which the primary variable is dependent should go under `slices`, and the variables on which _those_ variables are dependent
+> (which could be themselves, as in the case of the first collection in the above example) should go under `dimensionNames`. The exception
+> to this is that the primary variable is always listed under `dimensionNames.variable`.
 
-```
 
 ## Running the tests
 From `incubator-sdap-ingester/`, run:
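
The rules stated in the updated README (priorities on a 1-10 scale, four accepted `projection` values, `tile` required for ECCO, and the primary variable always under `dimensionNames.variable`) could be checked before a collections file is handed to the Collection Manager. The following is only a minimal sketch: `check_collection` is a hypothetical helper that is not part of the Collection Manager, it takes a collection entry that has already been parsed into a Python dict, and it assumes that absent priority or projection keys should simply be skipped, since the README does not say whether they are mandatory.

```python
# Hypothetical pre-flight check for one entry of collections.yaml.
# Not part of the Collection Manager; the rules come from the README text only.
ACCEPTED_PROJECTIONS = {"Grid", "ECCO", "TimeSeries", "Swath"}
PRIORITY_KEYS = ("priority", "forward-processing-priority")


def check_collection(collection):
    """Return a list of human-readable problems found in one collection entry."""
    problems = []

    # Priorities, when present, are integers on a 1-10 scale.
    for key in PRIORITY_KEYS:
        value = collection.get(key)
        if value is None:
            continue  # assumption: an absent key is not treated as an error here
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 10:
            problems.append(f"{key} must be an integer from 1 to 10, got {value!r}")

    # The projection, when present, must be one of the four accepted values.
    projection = collection.get("projection")
    if projection is not None and projection not in ACCEPTED_PROJECTIONS:
        problems.append(
            f"projection must be one of {sorted(ACCEPTED_PROJECTIONS)}, got {projection!r}"
        )

    # The primary variable is always listed under dimensionNames.variable.
    dimension_names = collection.get("dimensionNames") or {}
    if "variable" not in dimension_names:
        problems.append("dimensionNames.variable is missing")

    # "tile" is required when using the ECCO projection.
    if projection == "ECCO" and "tile" not in dimension_names:
        problems.append("dimensionNames.tile is required for the ECCO projection")

    return problems


if __name__ == "__main__":
    # The second collection from the README example, with "tile" left out on purpose.
    ecco = {
        "id": "ocean-bottom-pressure",
        "path": "/data/OBP/",
        "priority": 6,
        "forward-processing-priority": 7,
        "projection": "ECCO",
        "dimensionNames": {"latitude": "YC", "longitude": "XC", "time": "time", "variable": "OBP"},
        "slices": {"time": 1, "tile": 1, "i": 30, "j": 30},
    }
    for problem in check_collection(ecco):
        print(problem)
```

The sketch deliberately does not compare `slices` against `dimensionNames`, because the README notes that the two sets of names will not necessarily match.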
