+1 that looks clean..

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-----Original Message-----
From: <Ramirez>, "Paul M (398J)" <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, June 13, 2013 7:40 AM
To: "[email protected]" <[email protected]>
Subject: Re: Proposed Toolkit Refactoring

>All,
>
>What about instead of Plotter we collocate the plots into a Display class
>or module?
>
>Agree with Mike that the extra fluff around packages adds nothing and
>should therefore be dropped.
>
>
>rcmed = RCMED(database_info)
>obs = rcmed.loadObservation(key)
>model = local.loadModel(filepath)
>metric = [metrics.bias, metrics.pdf]
>evaluation = Evaluation(obs, model, metric)
>results = evaluation.run()
>
>Maybe I'm simplifying this too much but wouldn't the following suffice?
>
>ocw
>├── __init__.py
>├── dataset.py
>├── display.py
>├── evaluate.py
>├── io
>│   ├── __init__.py
>│   ├── esg.py
>│   ├── local.py
>│   └── rcmed.py
>└── metrics.py
>
>
>
>--Paul
>
>
>
>On 6/12/13 10:16 PM, "Kim, Jinwon" <[email protected]> wrote:
>
>>
>>"Do terms like 'Dataset', 'Evaluation', 'Metric', 'Plotter' make good
>>sense?"
>>--> these can be taken as 'standard terminology' except 'plotter'.
>>
>>------------------------------------------------------------------------
>>Jinwon Kim
>>Dept. Atmospheric and Oceanic Sciences and
>>Joint Institute for Regional Earth System Science and Engineering
>>University of California, Los Angeles
>>Los Angeles, CA 90095-1565
>>________________________________________
>>From: [email protected] [[email protected]] on behalf of Cameron Goodale [[email protected]]
>>Sent: Wednesday, June 12, 2013 10:10 PM
>>To: [email protected]
>>Subject: Re: Proposed Toolkit Refactoring
>>
>>I have to agree with Mike on this one, but reserve the right to change my
>>mind later ;)
>>
>>It is a delicate balance between organizing code so that it is
>>maintainable and decoupled, and still retaining an API that is easy for
>>humans to read and understand.
>>I will always favor direct and concise names over fuzzy or ambiguous
>>ones. Naming is hard, period.
>>
>>I don't like misc.Dataset.py; it feels clunky, and you don't get much
>>more fuzzy than 'misc', so we should get that cleaned up.
>>
>>Thank you Mazi for creating the wiki page so we can all visually see the
>>code structure; this is a big help to the project.
>>
>>I would like to invite any of the science users and/or devs to weigh in
>>on this. If the resulting API doesn't make sense to the end users, then
>>we have failed (in my opinion).
>>
>>Question for NON-Computer Scientists:
>>Do terms like 'Dataset', 'Evaluation', 'Metric', 'Plotter' make good
>>sense?
>>
>>
>>-Cameron
>>
>>
>>On Wed, Jun 12, 2013 at 4:43 PM, Michael Joyce <[email protected]> wrote:
>>
>>> I find the new structuring (A) to be a bit confusing. I think the part
>>> I find confusing is the 'data' part. Combining Dataset,
>>> DatasetProcessor, and DataSource into 'Data' seems to cause a bit of
>>> ambiguity. Let me see if I can give some examples.
>>>
>>> ocw.data.content is where I assume that Dataset is defined? The naming
>>> doesn't really imply this to me. "content" is a bit ambiguous.
>>>
>>> ocw.data.processing is fine but personally I find
>>> ocw.dataset_processor to be more clear. It makes it seem like you have
>>> this object (DatasetProcessor) that lets you "process datasets". The
>>> first one says to me "Oh, I can process data...what does that mean?".
>>>
>>> In my opinion, data_source.rcmed.getDataset() is more understandable
>>> than data.retrieve.rcmed.getDataset(). It makes it clear that 'rcmed'
>>> is a data source from which you can get a dataset. I think the second
>>> one does this as well but not as clearly as the first. This could
>>> certainly be fixed by changing some of the naming (maybe
>>> data.sources.rcmed.getDataset()) but then why bother with the extra
>>> level of nesting if it's not doing much/anything?
>>>
>>> Lastly, why have Plot.plotting.py? That extra directory doesn't
>>> accomplish anything outside of adding another level of nesting. If we
>>> plan on adding more 'Plot' related modules then I would say go for it,
>>> but Plot.plotting seems unnecessarily redundant given that plotting.py
>>> is the only module in Plot.
>>> --
>>>
>>> I will say that I'm not completely sold on misc.dataset for defining
>>> the "Dataset" class. Other than that I prefer the structuring we came
>>> up with Monday over the new one. I don't feel that the new structuring
>>> helps with the Dataset problem enough to warrant the changes it makes
>>> elsewhere.
>>>
>>>
>>> -- Joyce
>>>
>>>
>>> On Wed, Jun 12, 2013 at 1:21 PM, Boustani, Maziyar (398F) <[email protected]> wrote:
>>>
>>> > Hi All,
>>> >
>>> > On Monday this week Cam, Mike, and I had a 30-minute talk about
>>> > refactoring the RCMES code and coming up with a code structure.
>>> > The wiki page [1] shows two code structures, A and B.
>>> > Structure (B) is the one we came up with in Monday's talk.
>>> > Since then I have been trying to improve on it and came up with
>>> > structure (A).
>>> > Most of the improvements aim to make the naming easier for users to
>>> > understand and the structure simpler.
>>> > For example:
>>> > (B)                 (A)
>>> > misc.Dataset      = Data.content
>>> > DataSource.local  = Data.retrieve.local
>>> > DatasetProcessor  = Data.process
>>> >
>>> > Here are some "import" examples we can have with the new structure:
>>> > import Data.content
>>> > import Data.process
>>> > import Data.retrieve.local
>>> > import Data.retrieve.rcmed
>>> >
>>> > The Review Board request for this Python code will be posted soon.
>>> >
>>> > Thoughts?
>>> >
>>> > Best,
>>> > Mazi
>>> >
>>> > [1]:
>>> > https://cwiki.apache.org/confluence/display/CLIMATE/Open+Climate+Workbench+API+summary
>>> >
>>> >
>>> > On Jun 6, 2013, at 9:31 AM, Boustani, Maziyar (398F) wrote:
>>> >
>>> > > Hi All,
>>> > >
>>> > > Regarding the RCMES API refactoring (Toolkit), I thought it would
>>> > > be good to have a wiki page that summarizes the API we are going
>>> > > to have for RCMES in the future.
>>> > > This is not the actual documentation we will have later for the
>>> > > RCMES code; it is just the list of classes, modules, methods, and
>>> > > functions we may need to develop.
>>> > > It would be great if you could help me complete this wiki page
>>> > > before we start refactoring the toolkit's code.
>>> > >
>>> > > https://cwiki.apache.org/confluence/display/CLIMATE/Open+Climate+Workbench+API+summary
>>> > >
>>> > > Best,
>>> > > Mazi
>>> > >
>>> > >
>>> > > On Jun 5, 2013, at 7:40 AM, Michael Joyce wrote:
>>> > >
>>> > >> +1 for cutting 0.1-incubating and starting these changes in 0.2.
>>> > >>
>>> > >>
>>> > >> -- Joyce
>>> > >>
>>> > >>
>>> > >> On Wed, Jun 5, 2013 at 7:19 AM, Mattmann, Chris A (398J) <[email protected]> wrote:
>>> > >>
>>> > >>> This sounds like a good path to proceed down to me.
>>> > >>>
>>> > >>> I would formulate the below into a set of JIRA issues,
>>> > >>> then proceed by incrementally evolving the toolkit to
>>> > >>> support this.
>>> > >>>
>>> > >>> The only catch is that many of these could potentially
>>> > >>> break API back compat. Since we haven't really discussed
>>> > >>> the impact of this, nor made a release,
>>> > >>> it's certainly possible to do this in trunk.
>>> > >>>
>>> > >>> My suggestion though: since trunk represents what we
>>> > >>> all believe to be RCMET 2.0 API compat, we should probably
>>> > >>> create a branch for this. Or, better yet:
>>> > >>>
>>> > >>> 1.
Close out current JIRA issues for 0.1-incubating.
>>> > >>> 2. Cut a 0.1-incubating RC/release process.
>>> > >>> 3. Start to implement the below in 0.2-incubating.
>>> > >>>
>>> > >>> Thoughts?
>>> > >>>
>>> > >>> Cheers,
>>> > >>> Chris
>>> > >>>
>>> > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> > >>> Chris Mattmann, Ph.D.
>>> > >>> Senior Computer Scientist
>>> > >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> > >>> Office: 171-266B, Mailstop: 171-246
>>> > >>> Email: [email protected]
>>> > >>> WWW: http://sunset.usc.edu/~mattmann/
>>> > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> > >>> Adjunct Assistant Professor, Computer Science Department
>>> > >>> University of Southern California, Los Angeles, CA 90089 USA
>>> > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> > >>>
>>> > >>>
>>> > >>> -----Original Message-----
>>> > >>> From: Michael Joyce <[email protected]>
>>> > >>> Reply-To: "[email protected]" <[email protected]>
>>> > >>> Date: Wednesday, June 5, 2013 6:56 AM
>>> > >>> To: dev <[email protected]>
>>> > >>> Subject: Proposed Toolkit Refactoring
>>> > >>>
>>> > >>>> All,
>>> > >>>>
>>> > >>>> This is a brief rundown of a discussion that Paul, Cam, Mazi,
>>> > >>>> and I had yesterday regarding the current state of the toolkit
>>> > >>>> and proposed changes that we would like to discuss with the
>>> > >>>> list.
>>> > >>>>
>>> > >>>> We discussed adding a number of objects that should help
>>> > >>>> simplify toolkit usage. Below is a high-level rundown of our
>>> > >>>> discussion.
>>> > >>>>
>>> > >>>> --
>>> > >>>>
>>> > >>>> Dataset: Simple container object for a dataset. Provides
>>> > >>>> helpers for accessing relevant data (getLatsLons, getTime) and
>>> > >>>> convenience functions (writeToFile()).
>>> > >>>>
>>> > >>>> DataSource: Provides the user with helper functions for
>>> > >>>> grabbing the data that they want to evaluate. There's an RCMED
>>> > >>>> module specifically for grabbing RCMED data and a Local module
>>> > >>>> for grabbing local data. This could easily be expanded to
>>> > >>>> include ESG and other data sources.
>>> > >>>>
>>> > >>>> DatasetProcessor: Any operation that needs to be run on
>>> > >>>> datasets (that isn't the evaluation obviously) is found in the
>>> > >>>> DatasetProcessor. It supports:
>>> > >>>> - regridding (spatial and temporal)
>>> > >>>> - masking/cleaning/filtering
>>> > >>>> - subsetting (spatial and temporal)
>>> > >>>> - ensemble generation
>>> > >>>> - anything else that fits here.
>>> > >>>>
>>> > >>>> Evaluation: The Evaluation object is (surprise surprise) in
>>> > >>>> charge of running Evaluations. It keeps track of the datasets
>>> > >>>> (both 'reference' and the 'targets') that the user wants to use
>>> > >>>> in the evaluation. It runs all the necessary evaluations and
>>> > >>>> keeps the results nicely stored and readily accessible for the
>>> > >>>> user.
>>> > >>>>
>>> > >>>> Metric: Metrics are added to an Evaluation and used during the
>>> > >>>> run. All metrics inherit from the base Metric class. All you
>>> > >>>> need to do to add a new metric is inherit from Metric and
>>> > >>>> override the 'run' method.
>>> > >>>>
>>> > >>>> Plotter: The Plotter makes result visualization a breeze. If
>>> > >>>> you give it an Evaluation object it will spit out plots of all
>>> > >>>> the results. Give it a Dataset and it will spit out a plot. You
>>> > >>>> can even have it return Matplotlib objects so you can make your
>>> > >>>> results look exactly the way you'd like.
>>> > >>>>
>>> > >>>> -- Joyce
>>> > >>>
>>> > >>>
>>> > >
>>> >
>>> >
>>> >
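
For reference, here is a minimal sketch of the Dataset / Metric / Evaluation relationship described in the original proposal above. It is only an illustration, not the toolkit's actual code: the class names Dataset, Metric and Evaluation and the 'run' override come from the thread, while the constructor arguments, the 'values' and 'name' fields, the Bias definition (mean of target minus reference) and the shape of the results dictionary are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Dataset:
    """Simple container for a dataset (fields here are hypothetical)."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)


class Metric(ABC):
    """Base class for metrics; a new metric only overrides run()."""

    @abstractmethod
    def run(self, reference, target):
        """Return the metric value for one reference/target pair."""


class Bias(Metric):
    """Illustrative metric: mean of (target - reference)."""

    def run(self, reference, target):
        diffs = [t - r for r, t in zip(reference.values, target.values)]
        return sum(diffs) / len(diffs)


class Evaluation:
    """Tracks one reference dataset, the targets, and the metrics."""

    def __init__(self, reference, targets, metrics):
        self.reference = reference
        self.targets = list(targets)
        self.metrics = list(metrics)
        self.results = {}

    def run(self):
        # Apply every metric to every target and keep the results
        # keyed by metric class name, then by target dataset name.
        for metric in self.metrics:
            metric_name = type(metric).__name__
            self.results[metric_name] = {
                target.name: metric.run(self.reference, target)
                for target in self.targets
            }
        return self.results


obs = Dataset("obs", [1.0, 2.0, 3.0])
model = Dataset("model", [1.5, 2.5, 3.5])
evaluation = Evaluation(obs, [model], [Bias()])
results = evaluation.run()
print(results)  # {'Bias': {'model': 0.5}}
```

With this shape, adding another metric means writing one more Metric subclass and passing an instance to Evaluation; nothing in Evaluation itself has to change.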
