I think option 3 is the best choice, for the reasons already given in this
thread.

I can add a function to the dataset_processor module that takes a single
Dataset object and a list of SubRegion specifications (north, south, east,
west, name) and returns a tuple of SubRegion objects, one per
specification.
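
As a rough sketch of what I have in mind (everything here is a placeholder:
the Dataset below is a simplified stand-in and not the real OCW class, the
make_subregions name is made up, and I am ignoring the time axis and boxes
that cross the date line):

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Dataset:
    """Simplified stand-in for the OCW Dataset (assumption, not the real class)."""
    lats: List[float]
    lons: List[float]
    values: List[List[float]]  # values[i][j] sits at (lats[i], lons[j])
    name: str


@dataclass
class SubRegion(Dataset):
    """A Dataset cut down to a bounding box; remembers the bounds it came from."""
    north: float = 90.0
    south: float = -90.0
    east: float = 180.0
    west: float = -180.0


# Each spec is (north, south, east, west, name).
Spec = Tuple[float, float, float, float, str]


def make_subregions(dataset: Dataset, specs: Sequence[Spec]) -> Tuple[SubRegion, ...]:
    """Return one SubRegion per spec, in the same order as the specs."""
    subregions = []
    for north, south, east, west, name in specs:
        # Keep only the grid points that fall inside the bounding box.
        lat_idx = [i for i, lat in enumerate(dataset.lats) if south <= lat <= north]
        lon_idx = [j for j, lon in enumerate(dataset.lons) if west <= lon <= east]
        subregions.append(SubRegion(
            lats=[dataset.lats[i] for i in lat_idx],
            lons=[dataset.lons[j] for j in lon_idx],
            values=[[dataset.values[i][j] for j in lon_idx] for i in lat_idx],
            name=name,
            north=north, south=south, east=east, west=west,
        ))
    return tuple(subregions)


# Toy 3x3 grid with two made-up subregion specs.
ds = Dataset(
    lats=[0.0, 10.0, 20.0],
    lons=[0.0, 10.0, 20.0],
    values=[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    name="example",
)
regions = make_subregions(ds, [
    (10.0, 0.0, 10.0, 0.0, "R01"),
    (20.0, 10.0, 20.0, 20.0, "R02"),
])
```

Because SubRegion inherits from Dataset here, anything that accepts a
Dataset would also accept a SubRegion, and the subregion name lives in its
own field instead of being encoded in the dataset name.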

Subclassing Dataset makes sense because a Dataset and a SubRegion share
common attributes. Having talked with Mike about the two, though, I would
like a future science user to give me a clear difference between a Dataset
and a SubRegion.

I hope a SubRegion implies specific Metrics that cannot be run on a plain
Dataset. I fear that if SubRegion and Dataset are too similar, it will
merely confuse users (and software engineers) about when to use which one.

Can anyone articulate the difference between a Dataset and SubRegion for me?


Thanks,


Cameron


On Mon, Jul 29, 2013 at 12:22 PM, Michael Joyce <[email protected]> wrote:

> You covered most everything, Alex.
>
> I'm a fan of inheriting from Dataset to handle subregions. The user can
> still add the "dataset" the same way to an Evaluation. Then the Evaluation
> instance can run a separate eval loop to handle subregions. It makes
> Evaluation more complicated, but I feel that using a naming convention to
> designate a subregion would be worse. The DatasetProcessor could have a
> function that takes a Dataset and subregion information and spits out a new
> SubregionDataset (or some such meaningful name) instance that the user can
> add to the Evaluation.
>
> What does everyone think would be a good way of handling this?
>
>
> -- Joyce
>
>
> On Mon, Jul 29, 2013 at 11:36 AM, Goodman, Alexander (398J-Affiliate) <
> [email protected]> wrote:
>
> > Hi all,
> >
> > Being able to account for subregions will be a crucial part of running an
> > evaluation and making the right plots as part of our OCW refactoring. Mike
> > and I had a discussion last Friday on some ways to do this and we both
> > thought that the best approach would make use of the Dataset class somehow.
> > Some specific ideas we had include:
> >
> > 1) Designate datasets as subregional by convention. Specifically, this
> > could be something like making a new dataset instance with the same name as
> > the parent dataset but with the subregion name appended to the end with a
> > leading underscore (e.g. name_R01, name_R02).
> >
> > 2) Values for a particular subregion could be placed in a list or
> > dictionary as an attribute of Dataset.
> >
> > 3) Make a subclass of Dataset explicitly for subregions.
> >
> > In general, any approach will add an additional complication to some
> > component of the new OCW code in that the evaluation results / datasets
> > need to get grouped together by subregion.
> >
> > My preferred approach is (3) since it adds the least amount of complication
> > to the plotting. I particularly don't like (1) since enforcing a rule by
> > convention would add restrictions to users on valid names for datasets. For
> > example, a dataset name like 'TRMM_hourly_precip' would make it difficult to
> > incorporate subregions.
> >
> > Mike, my memory since our last meeting is a bit fuzzy so please clarify or
> > correct any of my points if I am wrong here. I would like to hear other
> > ideas or opinions as to the best approach for the subregion problem.
> >
> > Thanks,
> > Alex
> >
> > --
> > Alex Goodman
> >
>
