I find the new structure (A) a bit confusing, mainly the 'Data' part. Combining Dataset, DatasetProcessor, and DataSource into 'Data' seems to introduce some ambiguity. Let me see if I can give some examples.
I assume ocw.data.content is where Dataset is defined? The naming doesn't really imply that to me; "content" is a bit ambiguous.

ocw.data.processing is fine, but personally I find ocw.dataset_processor clearer. It makes it seem like you have an object (DatasetProcessor) that lets you "process datasets". The first one says to me "Oh, I can process data... what does that mean?".

In my opinion, data_source.rcmed.getDataset() is more understandable than data.retrieve.rcmed.getDataset(). It makes it clear that 'rcmed' is a data source from which you can get a dataset. I think the second one does this as well, but not as clearly as the first. This could certainly be fixed by changing some of the naming (maybe data.sources.rcmed.getDataset()), but then why bother with the extra level of nesting if it's not doing much, if anything?

Lastly, why have Plot.plotting.py? That extra directory doesn't accomplish anything beyond adding another level of nesting. If we plan on adding more 'Plot' related modules then I would say go for it, but Plot.plotting seems redundant given that plotting.py is the only module in Plot.

--

I will say that I'm not completely sold on misc.dataset for defining the "Dataset" class. Other than that, I prefer the structure we came up with on Monday over the new one. I don't feel that the new structure helps with the Dataset problem enough to warrant the changes it makes elsewhere.

-- Joyce

On Wed, Jun 12, 2013 at 1:21 PM, Boustani, Maziyar (398F) <[email protected]> wrote:

> Hi All,
>
> On Monday this week Cam, Mike, and I had a 30-minute talk about refactoring
> the RCMES code and coming up with a code structure.
> On the wiki page [1] there are two code structures, structure A and B.
> Structure (B) was the one we came up with in Monday's talk.
> Since then I have been trying to improve on it and came up with
> structure (A).
> Most of the improvements were aimed at making the naming easier for users
> to understand and at simplifying the structure.
> For example:
>       (B)                (A)
> misc.Dataset     = Data.content
> DataSource.local = Data.retrieve.local
> DatasetProcessor = Data.process
>
> Here are some "import" examples we can have with the new structure:
> import Data.content
> import Data.process
> import Data.retrieve.local
> import Data.retrieve.rcmed
>
> The Review Board for this Python code will come up soon.
>
> Thoughts?
>
> Best,
> Mazi
>
> [1]:
> https://cwiki.apache.org/confluence/display/CLIMATE/Open+Climate+Workbench+API+summary
>
>
> On Jun 6, 2013, at 9:31 AM, Boustani, Maziyar (398F) wrote:
>
> > Hi All,
> >
> > Regarding the RCMES API refactoring (Toolkit), I thought it would be
> > good to have a wiki page that summarizes the API we are going to have
> > for RCMES in the future.
> > This is not the actual documentation we will have later for the RCMES
> > code; it is just a list of the classes, modules, methods, and functions
> > we may need to develop.
> > It would be great if you could help me complete this wiki page before we
> > start refactoring the toolkit's code.
> >
> > https://cwiki.apache.org/confluence/display/CLIMATE/Open+Climate+Workbench+API+summary
> >
> > Best,
> > Mazi
> >
> >
> > On Jun 5, 2013, at 7:40 AM, Michael Joyce wrote:
> >
> >> +1 for cutting 0.1-incubating and starting these changes in 0.2.
> >>
> >>
> >> -- Joyce
> >>
> >>
> >> On Wed, Jun 5, 2013 at 7:19 AM, Mattmann, Chris A (398J) <[email protected]> wrote:
> >>
> >>> This sounds like a good path to proceed down to me.
> >>>
> >>> I would formulate the below into a set of JIRA issues,
> >>> then proceed by incrementally evolving the toolkit to
> >>> support this.
> >>>
> >>> The only catch is that many of these could potentially
> >>> break API back compat. Since we haven't really talked about
> >>> or suggested the impact of this, nor made a release,
> >>> it's certainly possible to do this in trunk.
> >>>
> >>> My suggestion though since trunk represents what we
> >>> all believe to be RCMET 2.0 API compat, we should probably
> >>> create a branch for this. Or, better yet:
> >>>
> >>> 1. Close out current JIRA issues for 0.1-incubating.
> >>> 2. Cut a 0.1-incubating RC/release process.
> >>> 3. Start to implement the below in 0.2-incubating.
> >>>
> >>> Thoughts?
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Chris Mattmann, Ph.D.
> >>> Senior Computer Scientist
> >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>> Office: 171-266B, Mailstop: 171-246
> >>> Email: [email protected]
> >>> WWW: http://sunset.usc.edu/~mattmann/
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Adjunct Assistant Professor, Computer Science Department
> >>> University of Southern California, Los Angeles, CA 90089 USA
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Michael Joyce <[email protected]>
> >>> Reply-To: "[email protected]" <[email protected]>
> >>> Date: Wednesday, June 5, 2013 6:56 AM
> >>> To: dev <[email protected]>
> >>> Subject: Proposed Toolkit Refactoring
> >>>
> >>>> All,
> >>>>
> >>>> This is a brief rundown of a discussion that Paul, Cam, Mazi, and I had
> >>>> yesterday regarding the current state of the toolkit and proposed changes
> >>>> that we would like to discuss with the list.
> >>>>
> >>>> We discussed adding a number of objects that should help simplify toolkit
> >>>> usage. Below is a high-level rundown of our discussion.
> >>>>
> >>>> --
> >>>>
> >>>> Dataset: Simple container object for a dataset. Provides helpers for
> >>>> accessing relevant data (getLatsLons, getTime) and convenience functions
> >>>> (writeToFile()).
> >>>>
> >>>> DataSource: Provides the user with helper functions for grabbing the data
> >>>> that they want to evaluate. There's an RCMED module specifically for
> >>>> grabbing RCMED data and a Local module for grabbing local data. This could
> >>>> easily be expanded to include ESG and other data sources.
> >>>>
> >>>> DatasetProcessor: Any operation that needs to be run on datasets (that
> >>>> isn't the evaluation, obviously) is found in the DatasetProcessor. It
> >>>> supports:
> >>>> - regridding (spatial and temporal)
> >>>> - masking/cleaning/filtering
> >>>> - subsetting (spatial and temporal)
> >>>> - ensemble generation
> >>>> - anything else that fits here.
> >>>>
> >>>> Evaluation: The Evaluation object is (surprise surprise) in charge of
> >>>> running Evaluations. It keeps track of the datasets (both the 'reference'
> >>>> and the 'targets') that the user wants to use in the evaluation. It runs
> >>>> all the necessary evaluations and keeps the results nicely stored and
> >>>> readily accessible for the user.
> >>>>
> >>>> Metric: Metrics are added to an Evaluation and used during the run. All
> >>>> metrics inherit from the base Metric class. All you need to do to add a
> >>>> new metric is inherit from Metric and override the 'run' method.
> >>>>
> >>>> Plotter: The Plotter makes result visualization a breeze. If you give it
> >>>> an Evaluation object it will spit out plots of all the results. Give it a
> >>>> Dataset and it will spit out a plot. You can even have it return
> >>>> Matplotlib objects so you can make your results look exactly the way
> >>>> you'd like.
> >>>>
> >>>> -- Joyce
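
For reference, here is a rough sketch of how the Metric and Evaluation pieces from the original proposal might fit together. Only the names Metric, Evaluation, and the 'run' method come from the thread; the Bias metric, the constructor arguments, the results layout, and the use of plain lists in place of Dataset objects are all assumptions made for illustration, not the agreed API.

```python
# Hypothetical sketch: only Metric, Evaluation, and run() are named in the
# proposal. Everything else here is assumed for illustration.


class Metric:
    """Base class; new metrics inherit from it and override run()."""

    def run(self, reference, target):
        raise NotImplementedError("subclasses must override run()")


class Bias(Metric):
    """Assumed example metric: mean of (target - reference)."""

    def run(self, reference, target):
        diffs = [t - r for r, t in zip(reference, target)]
        return sum(diffs) / len(diffs)


class Evaluation:
    """Tracks the reference and target datasets and stores the results."""

    def __init__(self, reference, targets, metrics):
        self.reference = reference
        self.targets = targets
        self.metrics = metrics
        self.results = []

    def run(self):
        # One row per metric, one entry per target dataset.
        self.results = [
            [metric.run(self.reference, target) for target in self.targets]
            for metric in self.metrics
        ]
        return self.results


# Plain lists stand in for Dataset objects in this sketch.
evaluation = Evaluation(
    reference=[1.0, 2.0, 3.0],
    targets=[[1.5, 2.5, 3.5], [1.0, 2.0, 3.0]],
    metrics=[Bias()],
)
results = evaluation.run()
```

With this shape, adding a metric only means writing another small subclass; the Evaluation itself does not change, which seems to be the intent of the "inherit from Metric and override 'run'" description.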
