+1, that looks clean.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: <Ramirez>, "Paul M   (398J)" <[email protected]>
Reply-To: "[email protected]"
<[email protected]>
Date: Thursday, June 13, 2013 7:40 AM
To: "[email protected]" <[email protected]>
Subject: Re: Proposed Toolkit Refactoring

>All,
>
>What about instead of Plotter we collocate the plots into a Display class
>or module?
>
>Agree with Mike that the extra fluff around packages isn't adding anything
>and should therefore be dropped.
>
>
>rcmed = RCMED(database_info)
>obs = rcmed.loadObservation(key)
>model = local.loadModel(filepath)
>metric = [metrics.bias, metrics.pdf]
>evaluation = Evaluation(obs, model, metric)
>results = evaluation.run()
>
>Maybe I'm simplifying this too much but wouldn't the following suffice?
>
>ocw
>├── __init__.py
>├── dataset.py
>├── display.py
>├── evaluate.py
>├── io
>│   ├── __init__.py
>│   ├── esg.py
>│   ├── local.py
>│   └── rcmed.py
>└── metrics.py
>
>
>
>--Paul
>
>
>
>On 6/12/13 10:16 PM, "Kim, Jinwon" <[email protected]> wrote:
>
>>
>>"Do terms like 'Dataset', 'Evaluation', 'Metric', 'Plotter' make good
>>sense?"
>>--> these can be taken as 'standard terminology' except 'plotter'.
>>
>>----------------------------------------------------------------------
>>Jinwon Kim
>>Dept. Atmospheric and Oceanic Sciences and
>>Joint Institute for Regional Earth System Science and Engineering
>>University of California, Los Angeles
>>Los Angeles, CA 90095-1565
>>________________________________________
>>From: [email protected] [[email protected]] on behalf of Cameron
>>Goodale [[email protected]]
>>Sent: Wednesday, June 12, 2013 10:10 PM
>>To: [email protected]
>>Subject: Re: Proposed Toolkit Refactoring
>>
>>I have to agree with Mike on this one, but reserve the right to change my
>>mind later ;)
>>
>>It is a delicate balance between organizing code that is maintainable,
>>decoupled, and still retains an API that is easy for humans to read and
>>understand.  I will always favor direct and concise names over fuzzy or
>>ambiguous ones.  Naming is hard, period.
>>
>>I don't like misc.Dataset.py; it feels clunky, and you don't get much
>>fuzzier than 'misc', so we should get that cleaned up.
>>
>>Thank you Mazi for creating the wiki page so we can all visually see the
>>code structure, this is a big help to the project.
>>
>>I would like to invite any of the science users and/or devs to weigh in
>>on this.  If the resulting API doesn't make sense to the end users, then
>>we have failed (in my opinion).
>>
>>Question for NON-Computer Scientists:
>>Do terms like 'Dataset', 'Evaluation', 'Metric', 'Plotter' make good
>>sense?
>>
>>
>>
>>
>>-Cameron
>>
>>
>>On Wed, Jun 12, 2013 at 4:43 PM, Michael Joyce <[email protected]> wrote:
>>
>>> I find the new structuring (A) to be a bit confusing. I think the part I
>>> find confusing is the 'data' part. Combining Dataset, DatasetProcessor,
>>> and DataSource into 'Data' seems to cause a bit of ambiguity. Let me see
>>> if I can give some examples.
>>>
>>> ocw.data.content is where I assume that Dataset is defined? The naming
>>> doesn't really imply this to me. "content" is a bit ambiguous.
>>>
>>> ocw.data.processing is fine, but personally I find ocw.dataset_processor
>>> to be clearer. It makes it seem like you have this object
>>> (DatasetProcessor) that lets you "process datasets". The first one says
>>> to me "Oh, I can process data...what does that mean?".
>>>
>>> In my opinion, data_source.rcmed.getDataset() is more understandable
>>> than data.retrieve.rcmed.getDataset(). It makes it clear that 'rcmed' is
>>> a datasource from which you can get a dataset. I think the second one
>>> does this as well but not as clearly as the first. This could certainly
>>> be fixed by changing some of the naming (maybe
>>> data.sources.rcmed.getDataset()), but then why bother with the extra
>>> level of nesting if it's not doing much/anything?
>>>
>>> Lastly, why have Plot.plotting.py? That extra directory doesn't
>>> accomplish anything outside of adding another level of nesting. If we
>>> plan on adding more 'Plot' related modules then I would say go for it,
>>> but Plot.plotting seems unnecessarily redundant given that plotting.py
>>> is the only module in Plot.
>>> --
>>>
>>> I will say that I'm not completely sold on misc.dataset for defining the
>>> "Dataset" class. Other than that, I prefer the structuring we came up
>>> with Monday over the new one. I don't feel that the new structuring
>>> helps with the Dataset problem enough to warrant the changes it makes
>>> elsewhere.
>>>
>>>
>>> -- Joyce
>>>
>>>
>>> On Wed, Jun 12, 2013 at 1:21 PM, Boustani, Maziyar (398F) <
>>> [email protected]> wrote:
>>>
>>> > Hi All,
>>> >
>>> > On Monday this week, Cam, Mike, and I had a 30-minute talk about
>>> > refactoring the RCMES code and coming up with a code structure.
>>> > On the wiki page [1] there are two code structures, structure A and B.
>>> > Structure (B) is the one we came up with in Monday's talk.
>>> > Since then I have been trying to improve on it and came up with
>>> > structure (A).
>>> > Most of the improvements aim to make the naming easier for users to
>>> > understand and the structure simpler.
>>> > For example:
>>> >                 (B)                      (A)
>>> >         misc.Dataset        =    Data.content
>>> >         DataSource.local    =    Data.retrieve.local
>>> >         DatasetProcessor    =    Data.process
>>> >
>>> > Here are some "import" examples we can have with the new structure:
>>> >         import Data.content
>>> >         import Data.process
>>> >         import Data.retrieve.local
>>> >         import Data.retrieve.rcmed
>>> >
>>> > A Review Board request for this Python code will be posted soon.
>>> >
>>> > Thoughts?
>>> >
>>> > Best,
>>> > Mazi
>>> >
>>> > [1]:
>>> > https://cwiki.apache.org/confluence/display/CLIMATE/Open+Climate+Workbench+API+summary
>>> >
>>> >
>>> >
>>> >
>>> > On Jun 6, 2013, at 9:31 AM, Boustani, Maziyar (398F) wrote:
>>> >
>>> > > Hi All,
>>> > >
>>> > > Regarding the RCMES API refactoring (Toolkit), I thought it would be
>>> > > good to have a wiki page that summarizes the API we are going to have
>>> > > for RCMES in the future.
>>> > > This is not the actual documentation we will have later for the RCMES
>>> > > code; it is just the list of classes, modules, methods, and functions
>>> > > we may need to develop.
>>> > > It would be great if you could help me complete this wiki page before
>>> > > we start refactoring the toolkit's code.
>>> > >
>>> > >
>>> > > https://cwiki.apache.org/confluence/display/CLIMATE/Open+Climate+Workbench+API+summary
>>> > >
>>> > > Best,
>>> > > Mazi
>>> > >
>>> > >
>>> > > On Jun 5, 2013, at 7:40 AM, Michael Joyce wrote:
>>> > >
>>> > >> +1 for cutting 0.1-incubating and starting these changes in 0.2.
>>> > >>
>>> > >>
>>> > >> -- Joyce
>>> > >>
>>> > >>
>>> > >> On Wed, Jun 5, 2013 at 7:19 AM, Mattmann, Chris A (398J) <
>>> > >> [email protected]> wrote:
>>> > >>
>>> > >>> This sounds like a good path to proceed down to me.
>>> > >>>
>>> > >>> I would formulate the below into a set of JIRA issues,
>>> > >>> then proceed by incrementally evolving the toolkit to
>>> > >>> support this.
>>> > >>>
>>> > >>> The only catch is that many of these could potentially
>>> > >>> break API back compat. Since we haven't really discussed
>>> > >>> the impact of this, nor made a release,
>>> > >>> it's certainly possible to do this in trunk.
>>> > >>>
>>> > >>> My suggestion, though: since trunk represents what we
>>> > >>> all believe to be RCMET 2.0 API compat, we should probably
>>> > >>> create a branch for this. Or, better yet:
>>> > >>>
>>> > >>> 1. Close out current JIRA issues for 0.1-incubating.
>>> > >>> 2. Cut a 0.1-incubating RC/release process.
>>> > >>> 3. Start to implement the below in 0.2-incubating.
>>> > >>>
>>> > >>> Thoughts?
>>> > >>>
>>> > >>> Cheers,
>>> > >>> Chris
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> -----Original Message-----
>>> > >>> From: Michael Joyce <[email protected]>
>>> > >>> Reply-To: "[email protected]"
>>> > >>> <[email protected]>
>>> > >>> Date: Wednesday, June 5, 2013 6:56 AM
>>> > >>> To: dev <[email protected]>
>>> > >>> Subject: Proposed Toolkit Refactoring
>>> > >>>
>>> > >>>> All,
>>> > >>>>
>>> > >>>> This is a brief rundown of a discussion that Paul, Cam, Mazi, and I
>>> > >>>> had yesterday regarding the current state of the toolkit and
>>> > >>>> proposed changes that we would like to discuss with the list.
>>> > >>>>
>>> > >>>> We discussed adding a number of objects that should help simplify
>>> > >>>> toolkit usage. Below is a high-level rundown of our discussion.
>>> > >>>>
>>> > >>>> --
>>> > >>>>
>>> > >>>> Dataset: Simple container object for a dataset. Provides helpers
>>> > >>>> for accessing relevant data (getLatsLons, getTime) and convenience
>>> > >>>> functions (writeToFile()).
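
A minimal sketch of what such a Dataset container could look like. Only the helper names getLatsLons, getTime, and writeToFile come from the description above; the field names, the flat `values` list, and the choice of JSON as the output format are assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class Dataset:
    """Hypothetical container; the fields are illustrative assumptions."""
    lats: List[float]
    lons: List[float]
    times: List[datetime]
    values: List[float]  # kept flat for brevity

    def getLatsLons(self) -> Tuple[List[float], List[float]]:
        # Return the latitude and longitude axes together.
        return self.lats, self.lons

    def getTime(self) -> List[datetime]:
        # Return the time axis.
        return self.times

    def writeToFile(self, path: str) -> None:
        # JSON is an arbitrary format choice for this sketch.
        payload = {
            "lats": self.lats,
            "lons": self.lons,
            "times": [t.isoformat() for t in self.times],
            "values": self.values,
        }
        with open(path, "w") as handle:
            json.dump(payload, handle)


ds = Dataset(
    lats=[10.0, 20.0],
    lons=[100.0, 110.0],
    times=[datetime(2013, 6, 1), datetime(2013, 7, 1)],
    values=[0.5, 0.7],
)
lats, lons = ds.getLatsLons()
```

Keeping the container free of any loading or processing logic matches the split described in the thread, where DataSource and DatasetProcessor carry those responsibilities.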
>>> > >>>>
>>> > >>>> DataSource: Provides the user with helper functions for grabbing
>>> > >>>> the data that they want to evaluate. There's an RCMED module
>>> > >>>> specifically for grabbing RCMED data and a Local module for
>>> > >>>> grabbing local data. This could easily be expanded to include ESG
>>> > >>>> and other data sources.
>>> > >>>>
>>> > >>>> DatasetProcessor: Any operation that needs to be run on datasets
>>> > >>>> (that isn't the evaluation, obviously) is found in the
>>> > >>>> DatasetProcessor. It supports:
>>> > >>>> - regridding (spatial and temporal)
>>> > >>>> - masking/cleaning/filtering
>>> > >>>> - subsetting (spatial and temporal)
>>> > >>>> - ensemble generation
>>> > >>>> - anything else that fits here.
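
As one illustration of the operations listed above, here is a rough sketch of temporal subsetting. The function name `temporal_subset` and its signature are hypothetical and not part of the proposed API; it works on plain Python lists rather than any real dataset type.

```python
from datetime import datetime
from typing import List, Tuple


def temporal_subset(
    times: List[datetime],
    values: List[float],
    start: datetime,
    end: datetime,
) -> Tuple[List[datetime], List[float]]:
    """Keep only the (time, value) pairs with start <= time <= end."""
    kept = [(t, v) for t, v in zip(times, values) if start <= t <= end]
    return [t for t, _ in kept], [v for _, v in kept]


# Six monthly samples, January through June 2013.
times = [datetime(2013, month, 1) for month in range(1, 7)]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

sub_times, sub_values = temporal_subset(
    times, values, datetime(2013, 2, 1), datetime(2013, 4, 1)
)
```

The function returns new lists instead of modifying its inputs, so the original data stays available for other processing steps.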
>>> > >>>>
>>> > >>>> Evaluation: The Evaluation object is (surprise surprise) in charge
>>> > >>>> of running Evaluations. It keeps track of the datasets (both
>>> > >>>> 'reference' and the 'targets') that the user wants to use in the
>>> > >>>> evaluation. It runs all the necessary evaluations and keeps the
>>> > >>>> results nicely stored and readily accessible for the user.
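
A minimal sketch of the Evaluation idea described above: one reference, several targets, and results kept on the object after a run. The constructor arguments, the use of plain callables as metrics, and the nested-dict result layout are all assumptions for illustration, not the agreed design.

```python
from typing import Callable, Dict, Sequence

# Assumed metric shape for this sketch: (reference, target) -> float.
MetricFn = Callable[[Sequence[float], Sequence[float]], float]


def mean_bias(reference: Sequence[float], target: Sequence[float]) -> float:
    """Mean of (target - reference) over paired values."""
    return sum(t - r for r, t in zip(reference, target)) / len(reference)


class Evaluation:
    """Hypothetical Evaluation that tracks a reference and named targets."""

    def __init__(
        self,
        reference: Sequence[float],
        targets: Dict[str, Sequence[float]],
        metrics: Dict[str, MetricFn],
    ) -> None:
        self.reference = list(reference)
        self.targets = dict(targets)
        self.metrics = dict(metrics)
        self.results: Dict[str, Dict[str, float]] = {}

    def run(self) -> Dict[str, Dict[str, float]]:
        # Apply every metric to every target against the single reference.
        for target_name, target in self.targets.items():
            self.results[target_name] = {
                metric_name: metric(self.reference, target)
                for metric_name, metric in self.metrics.items()
            }
        return self.results


evaluation = Evaluation(
    reference=[1.0, 2.0, 3.0],
    targets={"model_a": [2.0, 3.0, 4.0], "model_b": [1.0, 2.0, 3.0]},
    metrics={"bias": mean_bias},
)
results = evaluation.run()
```

Storing results on the object as well as returning them keeps them "readily accessible" after the run, as the description asks.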
>>> > >>>>
>>> > >>>> Metric: Metrics are added to an Evaluation and used during the
>>> > >>>> run. All metrics inherit from the base Metric class. All you need
>>> > >>>> to do to add a new metric is inherit from Metric and override the
>>> > >>>> 'run' method.
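
A short sketch of the inherit-and-override pattern described above. Only the base class name Metric and the 'run' method come from the thread; the argument list of `run` and the two example subclasses are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class Metric(ABC):
    """Base class; subclasses must override run()."""

    @abstractmethod
    def run(self, reference: Sequence[float], target: Sequence[float]) -> float:
        """Compute the metric for one reference/target pair."""


class Bias(Metric):
    # Hypothetical metric: mean of (target - reference).
    def run(self, reference: Sequence[float], target: Sequence[float]) -> float:
        return sum(t - r for r, t in zip(reference, target)) / len(reference)


class RootMeanSquareError(Metric):
    # Hypothetical metric: square root of the mean squared difference.
    def run(self, reference: Sequence[float], target: Sequence[float]) -> float:
        mse = sum((t - r) ** 2 for r, t in zip(reference, target)) / len(reference)
        return mse ** 0.5


bias_value = Bias().run([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
rmse_value = RootMeanSquareError().run([0.0, 0.0], [3.0, 3.0])
```

Declaring `run` abstract means a subclass that forgets to override it cannot be instantiated, so the mistake surfaces before an evaluation starts.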
>>> > >>>>
>>> > >>>> Plotter: The Plotter makes result visualization a breeze. If you
>>> > >>>> give it an Evaluation object it will spit out plots of all the
>>> > >>>> results. Give it a Dataset and it will spit out a plot. You can
>>> > >>>> even have it return Matplotlib objects so you can make your
>>> > >>>> results look exactly the way you'd like.
>>> > >>>>
>>> > >>>> -- Joyce
>>> > >>>
>>> > >>>
>>> > >
>>> >
>>> >
>>>
>
