This is an automated email from the ASF dual-hosted git repository.

tloubrieu pushed a commit to branch ascending_latitudes
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-ingester.git

commit 528ba21d64d61607f358afee587b0f1b6e2843c3
Author: Eamon Ford <[email protected]>
AuthorDate: Mon Dec 7 14:22:15 2020 -0800

    SDAP-297: Update Collections Config docs to match latest schema (#26)
---
 collection_manager/README.md | 81 ++++++++++++++++++++++++++++++++------------
 1 file changed, 60 insertions(+), 21 deletions(-)

diff --git a/collection_manager/README.md b/collection_manager/README.md
index 84df468..90e72fa 100644
--- a/collection_manager/README.md
+++ b/collection_manager/README.md
@@ -26,7 +26,7 @@ From `incubator-sdap-ingester`, run:
 
 A path to a collections configuration file must be passed in to the Collection Manager
 at startup via the `--collections-path` parameter. Below is an example of what the
-collections configuration file should look like:
+collections configuration file could look like:
 
 ```yaml
 # collections.yaml
@@ -34,35 +34,74 @@ collections configuration file should look like:
 collections:
 
   # The identifier for the dataset as it will appear in NEXUS.
-  - id: TELLUS_GRACE_MASCON_CRI_GRID_RL05_V2_LAND
+  - id: "CSR-RL06-Mascons_LAND"
 
-    # The local path to watch for NetCDF granule files to be associated with this dataset.
-    # Supports glob-style patterns.
-    path: /opt/data/grace/*land*.nc
-
-    # The name of the NetCDF variable to read when ingesting granules into NEXUS for this dataset.
-    variable: lwe_thickness
+    # The path to watch for NetCDF granule files to be associated with this dataset.
+    # This can also be an S3 path prefix, for example "s3://my-bucket/path/to/granules/"
+    path: "/data/CSR-RL06-Mascons-land/"
 
     # An integer priority level to use when publishing messages to RabbitMQ for historical data.
-    # Higher number = higher priority.
-    priority: 1
+    # Higher number = higher priority. Scale is 1-10.
+    priority: 1
 
     # An integer priority level to use when publishing messages to RabbitMQ for forward-processing data.
-    # Higher number = higher priority.
+    # Higher number = higher priority. Scale is 1-10.
     forward-processing-priority: 5
 
-  - id: TELLUS_GRACE_MASCON_CRI_GRID_RL05_V2_OCEAN
-    path: /opt/data/grace/*ocean*.nc
-    variable: lwe_thickness
-    priority: 2
-    forward-processing-priority: 6
+    # The type of projection to use when processing granules in this collection.
+    # Accepted values are Grid, ECCO, TimeSeries, or Swath.
+    projection: Grid
+
+    dimensionNames:
+      # The name of the primary variable
+      variable: lwe_thickness
+
+      # The name of the latitude variable
+      latitude: lat
+
+      # The name of the longitude variable
+      longitude: lon
+
+      # The name of the depth variable (only include if depth variable exists)
+      depth: Z
+
+      # The name of the time variable (only include if time variable exists)
+      time: Time
+
+    # This section is an index of each dimension on which the primary variable is dependent, mapped to their desired slice sizes.
+    slices:
+      Z: 1
+      Time: 1
+      lat: 60
+      lon: 60
+
+  - id: ocean-bottom-pressure
+    path: /data/OBP/
+    priority: 6
+    forward-processing-priority: 7
+    projection: ECCO
+    dimensionNames:
+      latitude: YC
+      longitude: XC
+      time: time
+      # "tile" is required when using the ECCO projection. This refers to the name of the dimension containing the ECCO tile index.
+      tile: tile
+      variable: OBP
+    slices:
+      time: 1
+      tile: 1
+      i: 30
+      j: 30
+```
+
+Note that the dimensions listed under `slices` will not necessarily match the values of the properties under `dimensionNames`. This is because sometimes
+the actual dimensions are referenced by index variables.
 
-  - id: AVHRR_OI-NCEI-L4-GLOB-v2.0
-    path: /opt/data/avhrr/*.nc
-    variable: analysed_sst
-    priority: 1
+> **Tip:** An easy way to determine which variables go under `dimensionNames` and which ones go under `slices` is that the variables
+> on which the primary variable is dependent should be listed under `slices`, and the variables on which _those_ variables are dependent
+> (which could be themselves, as in the case of the first collection in the above example) should be the values of the properties under
+> `dimensionNames`. The exception to this is that `dimensionNames.variable` should always be the name of the primary variable.
 
-```
 
 ## Running the tests
 From `incubator-sdap-ingester/`, run:
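
The comments in the updated example state three constraints that can be checked mechanically: both priorities use a 1-10 scale, `projection` accepts Grid, ECCO, TimeSeries, or Swath, and `tile` is required under `dimensionNames` for the ECCO projection. The following is a minimal Python sketch of such a check, not part of this commit or of the Collection Manager. The helper name `check_collection` and the use of a plain dict in place of parsed YAML are assumptions made for illustration.

```python
# Hypothetical helper for illustration only; not part of the Collection Manager.
ACCEPTED_PROJECTIONS = {"Grid", "ECCO", "TimeSeries", "Swath"}
PRIORITY_KEYS = ("priority", "forward-processing-priority")


def check_collection(entry):
    """Return a list of human-readable problems for one collection entry."""
    problems = []

    # Both priority fields are documented as integers on a 1-10 scale.
    for key in PRIORITY_KEYS:
        value = entry.get(key)
        if value is None:
            continue  # the README does not say whether the field is mandatory
        if not isinstance(value, int) or not 1 <= value <= 10:
            problems.append(f"{key} should be an integer from 1 to 10, got {value!r}")

    # The README lists four accepted projection values.
    projection = entry.get("projection")
    if projection not in ACCEPTED_PROJECTIONS:
        problems.append(
            f"projection should be Grid, ECCO, TimeSeries, or Swath, got {projection!r}"
        )

    # "tile" is required under dimensionNames when using the ECCO projection.
    if projection == "ECCO" and "tile" not in entry.get("dimensionNames", {}):
        problems.append("dimensionNames.tile is required for the ECCO projection")

    return problems


# The second collection from the example config, written as a plain dict.
ecco_entry = {
    "id": "ocean-bottom-pressure",
    "path": "/data/OBP/",
    "priority": 6,
    "forward-processing-priority": 7,
    "projection": "ECCO",
    "dimensionNames": {
        "latitude": "YC",
        "longitude": "XC",
        "time": "time",
        "tile": "tile",
        "variable": "OBP",
    },
    "slices": {"time": 1, "tile": 1, "i": 30, "j": 30},
}

print(check_collection(ecco_entry))  # []

# Same entry with an out-of-range priority and no "tile" dimension name.
broken = dict(ecco_entry, priority=11, dimensionNames={"variable": "OBP"})
for problem in check_collection(broken):
    print(problem)
```

The sketch checks only what the README comments state explicitly. It does not verify that the keys under `slices` match the dimensions of the primary variable, since that requires opening the granule files.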
