I think option 3 is the best given the rationale that has been stated previously.
I can add a function to the dataset_processor module that takes a single Dataset object and a list of SubRegion specifications (north, south, east, west, name) and returns a tuple of SubRegion objects, one per specification.

Subclassing Dataset makes sense because a Dataset and a SubRegion share common attributes. After talking with Mike about the two, though, I would still like a future science user to give me a clear difference between a Dataset and a SubRegion. I hope a SubRegion implies specific Metrics that cannot be run on a Dataset. I fear that if SubRegion and Dataset are too similar, it will merely confuse users (and software engineers) about when to use which one.

Can anyone articulate the difference between a Dataset and a SubRegion for me?

Thanks,

Cameron


On Mon, Jul 29, 2013 at 12:22 PM, Michael Joyce <[email protected]> wrote:

> You covered most everything Alex.
>
> I'm a fan of inheriting from Dataset to handle Subregions. The user can
> still add the "dataset" the same way to an Evaluation. Then the Evaluation
> instance can run a separate eval loop to handle subregions. It makes
> Evaluation more complicated but using naming convention to designate a
> subregion will just be worse I feel. The DatasetProcessor could have a
> function that takes a Dataset and subregion information and spits out a new
> SubregionDataset (or some such meaningful name) instance that the user can
> add to the Evaluation.
>
> What does everyone think would be a good way of handling this?
>
>
> -- Joyce
>
>
> On Mon, Jul 29, 2013 at 11:36 AM, Goodman, Alexander (398J-Affiliate) <
> [email protected]> wrote:
>
> > Hi all,
> >
> > Being able to account for subregions will be a crucial part of running an
> > evaluation and making the right plots as part of our OCW refactoring. Mike
> > and I had a discussion last Friday on some ways to do this and we both
> > thought that the best approach would make use of the Dataset class somehow.
> >
> > Some specific ideas we had include:
> >
> > 1) Designate datasets as subregional by convention. Specifically, this
> > could be something like making a new dataset instance with the same name as
> > the parent dataset but with the subregion name appended to the end with a
> > leading underscore (e.g. name_R01, name_R02).
> >
> > 2) Values for a particular subregion could be placed in a list or dictionary
> > as an attribute of Dataset.
> >
> > 3) Make a subclass of Dataset explicitly for subregions.
> >
> > In general, any approach will add an additional complication to some
> > component of the new OCW code in that the evaluation results / datasets
> > need to get grouped together by subregion.
> >
> > My preferred approach is (3) since it adds the least amount of complication
> > to the plotting. I particularly don't like (1) since enforcing a rule by
> > convention would add restrictions to users on valid names for datasets, for
> > example a dataset name like 'TRMM_hourly_precip' would make it difficult to
> > incorporate subregions.
> >
> > Mike, my memory since our last meeting is a bit fuzzy so please clarify or
> > correct any of my points if I am wrong here. I would like to hear other
> > ideas or opinions as to the best approach for the subregion problem.
> >
> > Thanks,
> > Alex
> >
> > --
> > Alex Goodman
> >
>

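For concreteness, the dataset_processor function proposed at the top of this message might look roughly like the sketch below. This is only an illustration under several assumptions: the Dataset class here is a simplified stand-in and not the real OCW class, and the names extract_subregions, SubRegionSpec, and region_name are hypothetical placeholders. Values are held in plain nested lists to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Dataset:
    """Simplified stand-in for the OCW Dataset (assumption, not the real class)."""
    name: str
    lats: List[float]          # 1-D latitudes
    lons: List[float]          # 1-D longitudes
    values: List[List[float]]  # values[i][j] belongs to (lats[i], lons[j])


@dataclass
class SubRegion(Dataset):
    """A Dataset cut down to a bounding box (option 3: subclass of Dataset)."""
    region_name: str = ""      # hypothetical attribute holding the subregion label


# (north, south, east, west, name), in the order listed in this thread
SubRegionSpec = Tuple[float, float, float, float, str]


def extract_subregions(dataset: Dataset,
                       specs: Sequence[SubRegionSpec]) -> Tuple[SubRegion, ...]:
    """Return one SubRegion per spec, in the same order as specs."""
    subregions = []
    for north, south, east, west, region_name in specs:
        # Indices of the grid points that fall inside the bounding box.
        lat_idx = [i for i, lat in enumerate(dataset.lats) if south <= lat <= north]
        lon_idx = [j for j, lon in enumerate(dataset.lons) if west <= lon <= east]
        subregions.append(SubRegion(
            name=dataset.name,  # parent name is kept unchanged
            lats=[dataset.lats[i] for i in lat_idx],
            lons=[dataset.lons[j] for j in lon_idx],
            values=[[dataset.values[i][j] for j in lon_idx] for i in lat_idx],
            region_name=region_name,
        ))
    return tuple(subregions)


# Usage with a made-up 3x3 grid
trmm = Dataset("TRMM_hourly_precip",
               [0.0, 10.0, 20.0],
               [100.0, 110.0, 120.0],
               [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
r01, r02 = extract_subregions(trmm, [(10.0, 0.0, 110.0, 100.0, "R01"),
                                     (20.0, 10.0, 120.0, 110.0, "R02")])
```

Because the subregion label lives in its own attribute, the parent dataset name stays untouched, so a name such as 'TRMM_hourly_precip' causes no trouble. Each result is still a Dataset instance, so it could in principle be added to an Evaluation the same way. The sketch does not settle which Metrics, if any, should apply only to a SubRegion.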