[ 
https://issues.apache.org/jira/browse/CLIMATE-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on CLIMATE-575 started by Michael Joyce.
---------------------------------------------
> Implement initial config based execution of an evaluation
> ---------------------------------------------------------
>
>                 Key: CLIMATE-575
>                 URL: https://issues.apache.org/jira/browse/CLIMATE-575
>             Project: Apache Open Climate Workbench
>          Issue Type: Task
>          Components: general
>    Affects Versions: 0.5
>            Reporter: Michael Joyce
>            Assignee: Michael Joyce
>             Fix For: 1.0.0
>
>
> Brainstorming ideas for an initial config format for running an evaluation. I 
> have an idea of one below. Note that this doesn't necessarily encapsulate all 
> the functionality in the system yet. Empty sections are still a work in 
> progress and will be filled in when possible.
> ---
> At the moment, the assumption is that there will be a single config file per 
> evaluation.
> h1. Sections
> There will be sections for
> * Datasets
> * Metrics
> * Plotting
> h2. Datasets
> Specified under a \[datasets\] tag. This will be where all the datasets that 
> will be loaded will be specified. A dataset will be specified with the 
> following format:
> eval_purpose_identifier: data_source_keyword data_source_locator_data 
> optional_keyword_args
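
To make the proposed line format concrete, here is a minimal Python sketch that splits one entry into its three top-level parts. The function name `split_dataset_entry`, the sample paths, and the example URL are hypothetical; the sketch assumes only the `identifier: keyword rest` layout described above and is not the library's parser.

```python
def split_dataset_entry(line):
    """Split 'purpose: source rest...' into (purpose, source, rest)."""
    # Split on the first colon only, so colons inside URLs are left alone.
    purpose, sep, remainder = line.partition(":")
    if not sep:
        raise ValueError("missing ':' after eval_purpose_identifier")
    purpose = purpose.strip()
    if purpose not in ("reference", "target"):
        raise ValueError("unknown eval_purpose_identifier: %r" % purpose)
    # The first whitespace-delimited token is the data_source_keyword.
    parts = remainder.split(None, 1)
    if not parts:
        raise ValueError("missing data_source_keyword")
    source = parts[0]
    rest = parts[1].strip() if len(parts) > 1 else ""
    return purpose, source, rest


# Hypothetical entries. Several lines may share the "target" identifier,
# so results are collected in a list rather than a dict keyed by purpose.
entries = [
    "reference: local /tmp/obs.nc tas",
    "target: dap http://example.org/dap/model.nc tas",
]
parsed = [split_dataset_entry(e) for e in entries]
```

Keeping the entries in a list matters here: any parser that stores options in a dictionary keyed by name would keep only one of several "target" lines.
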
> h3. eval_purpose_identifier
> Either "reference" or "target". If there are multiple target datasets in the 
> evaluation then they should all share the eval_purpose_identifier of "target".
> h3. data_source_keyword
> Specifies which data source will be used to load this dataset. In the 
> library's current state, the valid options are "local", "dap", "rcmed", and 
> "esgf".
> h3. data_source_locator_data
> Data necessary for loading the dataset. This varies based on the data source 
> that will be used for loading this data. If you look at the docs for the data 
> sources, these are effectively the required elements for loading a dataset.
> h4. local data_source_locator_data
> There will be two parts for a local datasource. Each of these should be 
> separated by a space.
> * The path to the file to load (for a single-file dataset) or, for a 
> multi-file dataset, the path to the directory where the files are located, 
> followed by the accepted separator text (tentatively "###") and the glob 
> pattern for the files to load.
> * The variable name
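
As a sketch of how the first part of a local locator could be interpreted, the snippet below splits the path field on the tentative "###" separator. The function name `split_local_path` and the sample paths are hypothetical, and the separator itself is still only a proposal.

```python
def split_local_path(path_field, separator="###"):
    """Return (path, glob_pattern); glob_pattern is None for a single file."""
    directory, sep, pattern = path_field.partition(separator)
    if not sep:
        # No separator present: treat the field as a single file path.
        return path_field, None
    # Separator present: the left side is the directory, the right side
    # is the glob pattern selecting the files inside it.
    return directory, pattern


single = split_local_path("/data/obs.nc")
multi = split_local_path("/data/run1###*.nc")
```
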
> h4. dap data_source_locator_data
> Each of these should be separated by a space.
> * OpenDAP URL
> * Variable name
> h4. rcmed data_source_locator_data
> Each of these should be separated by a space.
> * dataset_id
> * parameter_id
> * min_lat
> * max_lat
> * min_lon
> * max_lon
> * start_time
> * end_time
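
The eight rcmed fields could be converted into typed values along these lines. This is only a sketch under stated assumptions: the ids are assumed to be integers, the time format (`%Y-%m-%d`) is a placeholder since the issue leaves it open, and `RcmedLocator` and `parse_rcmed_locator` are hypothetical names.

```python
from collections import namedtuple
from datetime import datetime

# Field order follows the list above.
RcmedLocator = namedtuple("RcmedLocator", [
    "dataset_id", "parameter_id",
    "min_lat", "max_lat", "min_lon", "max_lon",
    "start_time", "end_time",
])

# Assumption: dates are written as YYYY-MM-DD. The issue has not settled this.
TIME_FORMAT = "%Y-%m-%d"


def parse_rcmed_locator(fields):
    """Convert eight already-split string fields into typed values."""
    if len(fields) != 8:
        raise ValueError("expected 8 rcmed fields, got %d" % len(fields))
    return RcmedLocator(
        dataset_id=int(fields[0]),      # assumed to be an integer id
        parameter_id=int(fields[1]),    # assumed to be an integer id
        min_lat=float(fields[2]),
        max_lat=float(fields[3]),
        min_lon=float(fields[4]),
        max_lon=float(fields[5]),
        start_time=datetime.strptime(fields[6], TIME_FORMAT),
        end_time=datetime.strptime(fields[7], TIME_FORMAT),
    )


# Made-up values, for illustration only.
loc = parse_rcmed_locator(
    "3 36 -10.0 10.0 20.0 40.0 2000-01-01 2001-01-01".split()
)
```
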
> h4. esgf data_source_locator_data
> Each of these should be separated by a space.
> * dataset_id
> * variable name
> * esgf username
> * esgf password
> h3. optional_keyword_args
> Any additional keyword args should be specified as a tuple after all of the 
> required values have been specified. Again, these should be separated by a 
> space from each other. Check the API docs for valid keyword args.
> h2. Metrics
> h2. Plotting
> ---
> Thoughts?
> A few of my concerns are:
> * Can we use whitespace to separate multiple items that we're passing, and 
> how will we handle single elements that contain valid whitespace, such as 
> file paths? If we place elements in quotes will that help with grouping? 
> Should we use a specific separator value to split everything?
> * How should we pass the time formats for RCMED datasets?
> * Can we pass keyword args as a tuple? Will this work?
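
On the whitespace question, one possible approach (a sketch, not a decision) is to tokenize with Python's standard `shlex` module, which splits on whitespace but keeps quoted elements together. The sample line and path are made up for illustration.

```python
import shlex

# A path containing spaces is wrapped in double quotes in the config line.
line = 'local "/data/my runs/obs file.nc" tas'

# shlex.split uses POSIX shell-like rules: whitespace separates tokens,
# and quotes group a token and are removed from the result.
tokens = shlex.split(line)
source, path, variable = tokens
```

One caveat with this option is that backslashes are treated as escape characters in this mode, so Windows-style paths would need care.
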



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
