Awesome, please add yourself as a mentor. I’d appreciate it!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-----Original Message-----
From: James Carman <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, April 6, 2015 at 10:05 AM
To: "[email protected]" <[email protected]>
Subject: Re: [PROPOSAL] Climate Model Diagnostic Analyzer

>I would love to help out. I don't know much about the problem domain,
>but I am a "sciency" kind of guy.
>
>
>On Mon, Apr 6, 2015 at 12:30 PM, Mattmann, Chris A (3980)
><[email protected]> wrote:
>> :) you volunteering as a mentor? Could use your help!
>>
>> Sent from my iPhone
>>
>>> On Apr 6, 2015, at 9:18 AM, James Carman <[email protected]> wrote:
>>>
>>> Apache Camdan?
>>>
>>> On Monday, March 23, 2015, Mattmann, Chris A (3980)
>>> <[email protected]> wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> I am pleased to submit for consideration to the Apache Incubator
>>>> the Climate Model Diagnostic Analyzer proposal. We are actively
>>>> soliciting interested mentors in this project related to climate
>>>> science and analytics and big data.
>>>>
>>>> Please find the wiki text of the proposal below and the link up
>>>> on the wiki here:
>>>>
>>>> https://wiki.apache.org/incubator/ClimateModelDiagnosticAnalyzerProposal
>>>>
>>>> Thank you for your consideration!
>>>>
>>>> Cheers,
>>>> Chris
>>>> (on behalf of the Climate Model Diagnostic Analyzer community)
>>>>
>>>> = Apache ClimateModelDiagnosticAnalyzer Proposal =
>>>>
>>>> == Abstract ==
>>>>
>>>> The Climate Model Diagnostic Analyzer (CMDA) provides web services for
>>>> multi-aspect physics-based and phenomenon-oriented climate model
>>>> performance evaluation and diagnosis through the comprehensive and
>>>> synergistic use of multiple observational data, reanalysis data, and
>>>> model outputs.
>>>>
>>>> == Proposal ==
>>>>
>>>> The proposed web-based tools let users display, analyze, and download
>>>> earth science data interactively.
>>>> These tools help scientists quickly examine data to identify
>>>> specific features, e.g., trends, geographical distributions, etc.,
>>>> and determine whether a further study is needed. All of the tools are
>>>> designed and implemented to be general so that data from models,
>>>> observation, and reanalysis are processed and displayed in a unified
>>>> way to facilitate fair comparisons. The services prepare and display
>>>> data as a colored map or an X-Y plot and allow users to download the
>>>> analyzed data. Basic visual capabilities include 1) displaying a
>>>> two-dimensional variable as a map, zonal mean, and time series, and
>>>> 2) displaying a three-dimensional variable’s zonal mean, a
>>>> two-dimensional slice at a specific altitude, and a vertical profile.
>>>> General analysis can be done using the difference, scatter plot, and
>>>> conditional sampling services. All the tools support display options
>>>> for using linear or logarithmic scales and allow users to specify a
>>>> temporal range and months in a year. The source/input datasets for
>>>> these tools are CMIP5 model outputs, Obs4MIP observational datasets,
>>>> and ECMWF reanalysis datasets. They are stored on the server and are
>>>> selectable by a user through the web services.
>>>>
>>>> === Service descriptions ===
>>>>
>>>> 1. '''Two dimensional variable services'''
>>>>
>>>> * Map of two-dimensional variable: This service displays a
>>>> two-dimensional variable as a colored longitude and latitude map with
>>>> values represented by a color scheme. Longitude and latitude ranges
>>>> can be specified to magnify a specific region.
>>>>
>>>> * Two dimensional variable zonal mean: This service plots the zonal
>>>> mean value of a two-dimensional variable as a function of the
>>>> latitude in terms of an X-Y plot.
>>>>
>>>> * Two dimensional variable time series: This service displays the
>>>> average of a two-dimensional variable over the specified region as a
>>>> function of time as an X-Y plot.
>>>>
>>>> 2. '''Three dimensional variable services'''
>>>>
>>>> * Map of a two dimensional slice of a three-dimensional variable:
>>>> This service displays a two-dimensional slice of a three-dimensional
>>>> variable at a specific altitude as a colored longitude and latitude
>>>> map with values represented by a color scheme.
>>>>
>>>> * Three dimensional zonal mean: The zonal mean of the specified
>>>> three-dimensional variable is computed and displayed as a colored
>>>> altitude-latitude map.
>>>>
>>>> * Vertical profile of a three-dimensional variable: This service
>>>> computes the area-weighted average of a three-dimensional variable
>>>> over the specified region and displays the average as a function of
>>>> pressure level (altitude) as an X-Y plot.
>>>>
>>>> 3. '''General services'''
>>>>
>>>> * Difference of two variables: This service displays the differences
>>>> between two variables, each of which can be either a two-dimensional
>>>> variable or a slice of a three-dimensional variable at a specified
>>>> altitude, as colored longitude and latitude maps.
>>>>
>>>> * Scatter and histogram plots of two variables: This service
>>>> displays the scatter plot (X-Y plot) between two specified variables
>>>> and the histograms of the two variables. The number of samples can be
>>>> specified and the correlation is computed. The two variables can be
>>>> either a two-dimensional variable or a slice of a three-dimensional
>>>> variable at a specific altitude.
>>>>
>>>> * Conditional sampling: This service lets users sort a physical
>>>> quantity of two or three dimensions according to the values of
>>>> another variable (environmental condition, e.g.
>>>> SST), which may be a two-dimensional variable or a slice of a
>>>> three-dimensional variable at a specific altitude. For a
>>>> two-dimensional quantity, the plot is displayed as an X-Y plot, and
>>>> for a three-dimensional quantity, the plot is displayed as a colored
>>>> map.
>>>>
>>>>
>>>> == Background and Rationale ==
>>>>
>>>> The latest Intergovernmental Panel on Climate Change (IPCC) Fourth
>>>> Assessment Report stressed the need for the comprehensive and
>>>> innovative evaluation of climate models with newly available global
>>>> observations. The traditional approach to climate model evaluation,
>>>> which is the comparison of a single parameter at a time, identifies
>>>> symptomatic model biases and errors but fails to diagnose the model
>>>> problems. The model diagnosis process requires physics-based
>>>> multi-variable comparisons, which typically involve large-volume and
>>>> heterogeneous datasets, and computationally demanding and
>>>> data-intensive operations. We propose to develop a computationally
>>>> efficient information system to enable the physics-based
>>>> multi-variable model performance evaluations and diagnoses through the
>>>> comprehensive and synergistic use of multiple observational data,
>>>> reanalysis data, and model outputs.
>>>>
>>>> Satellite observations have been widely used in model-data
>>>> inter-comparisons and model evaluation studies. These studies normally
>>>> involve the comparison of a single parameter at a time using a time
>>>> and space average. For example, modeling cloud-related processes in
>>>> global climate models requires cloud parameterizations that provide
>>>> quantitative rules for expressing the location, frequency of
>>>> occurrence, and intensity of the clouds in terms of multiple
>>>> large-scale model-resolved parameters such as temperature, pressure,
>>>> humidity, and wind.
>>>> One can evaluate the performance of the cloud parameterization by
>>>> comparing the cloud water content with satellite data and can
>>>> identify symptomatic model biases or errors. However, in order to
>>>> understand the cause of the biases and errors, one has to
>>>> simultaneously investigate several parameters that are integrated in
>>>> the cloud parameterization.
>>>>
>>>> Such studies, aimed at a multi-parameter model diagnosis, require
>>>> locating, understanding, and manipulating multi-source observation
>>>> datasets, model outputs, and (re)analysis outputs that are physically
>>>> distributed, massive in volume, heterogeneous in format, and provide
>>>> little information on data quality and production legacy.
>>>> Additionally, these studies involve various data preparation and
>>>> processing steps that can easily become computationally demanding
>>>> since many datasets have to be combined and processed simultaneously.
>>>> Notoriously, scientists spend more than 60% of their research time
>>>> just preparing a dataset before it can be analyzed for their
>>>> research.
>>>>
>>>> To address these challenges, we propose to build the Climate Model
>>>> Diagnostic Analyzer (CMDA), which will enable a streamlined and
>>>> structured preparation of multiple large-volume and heterogeneous
>>>> datasets, and provide a computationally efficient approach to
>>>> processing the datasets for model diagnosis. We will leverage the
>>>> existing information technologies and scientific tools that we
>>>> developed in our current NASA ROSES COUND, MAP, and AIST projects. We
>>>> will utilize open-source web-service technology.
>>>> We will make CMDA complementary to other climate model analysis tools
>>>> currently available to the research community (e.g., PCMDI’s CDAT and
>>>> NCAR’s CCMVal) by focusing on the missing capabilities such as
>>>> conditional sampling, and probability distribution function and
>>>> cluster analysis of multiple-instrument datasets. The users will be
>>>> able to use a web browser to interface with CMDA.
>>>>
>>>> == Current Status ==
>>>>
>>>> The current version of ClimateModelDiagnosticAnalyzer was developed
>>>> by a team at the Jet Propulsion Laboratory (JPL). The project was
>>>> initiated as a NASA-sponsored project (ROSES-CMAC) in 2011.
>>>>
>>>> == Meritocracy ==
>>>>
>>>> The current developers are not familiar with meritocratic open source
>>>> development at Apache, but would like to encourage this style of
>>>> development for the project.
>>>>
>>>> == Community ==
>>>>
>>>> While ClimateModelDiagnosticAnalyzer started as a JPL research
>>>> project, it has been used in the 2014 Caltech Summer School sponsored
>>>> by the JPL Center for Climate Sciences. Some 23 students from
>>>> different institutions around the world participated. We deployed the
>>>> tool to the Amazon Cloud and gave each student his or her own virtual
>>>> machine. Students gave positive feedback, mostly on the usability and
>>>> speed of our web services. We also collected a number of enhancement
>>>> requests. We seek to further grow the developer and user communities
>>>> using the Apache open source venue. During incubation we will
>>>> explicitly seek increased academic collaborations (e.g., with
>>>> Carnegie Mellon University) as well as industrial participation.
>>>>
>>>> One instance of our web services can be found at:
>>>> http://cmacws.jpl.nasa.gov:8080/cmac/
>>>>
>>>> == Core Developers ==
>>>>
>>>> The core developers of the project are JPL scientists and software
>>>> developers.
>>>>
>>>> == Alignment ==
>>>>
>>>> Apache is the most natural home for taking the
>>>> ClimateModelDiagnosticAnalyzer project forward. It is well-aligned
>>>> with some Apache projects such as Apache Open Climate Workbench.
>>>> ClimateModelDiagnosticAnalyzer also seeks to achieve an Apache-style
>>>> development model; it is seeking a broader community of contributors
>>>> and users in order to achieve its full potential and value to the
>>>> Climate Science and Big Data community.
>>>>
>>>> There are also a number of dependencies that will be mentioned below
>>>> in the Relationships with Other Apache Products section.
>>>>
>>>>
>>>> == Known Risks ==
>>>>
>>>> === Orphaned products ===
>>>>
>>>> Given the current level of intellectual investment in
>>>> ClimateModelDiagnosticAnalyzer, the risk of the project being
>>>> abandoned is very small. Carnegie Mellon University and JPL are
>>>> collaborating (2014-2015) to build a service for climate analytics
>>>> workflow recommendation using funding from NASA. A two-year NASA AIST
>>>> project (2015-2016) will soon start to add diagnostic analysis
>>>> methodologies such as the conditional sampling method, conditional
>>>> probability density function, data co-location, and random forest. We
>>>> will also infuse provenance technology into CMDA so that the history
>>>> of the data products and workflows will be automatically collected
>>>> and saved. This information will also be indexed so that the products
>>>> and workflows can be searched by the community of climate scientists
>>>> and students.
>>>>
>>>> === Inexperience with Open Source ===
>>>>
>>>> The current developers of ClimateModelDiagnosticAnalyzer are
>>>> inexperienced with Open Source. However, our Champion Chris Mattmann
>>>> is experienced (Champion of ApacheOpenClimateWorkbench and AsterixDB)
>>>> and will be working closely with us, also as the Chief Architect of
>>>> our JPL section.
>>>>
>>>> === Relationships with Other Apache Products ===
>>>>
>>>> Clearly there is a direct relationship between this project and
>>>> Apache Open Climate Workbench, already a top-level Apache project and
>>>> also brought to the ASF by its Champion (and ours), Chris Mattmann.
>>>> We plan on directly collaborating with the Open Climate Workbench
>>>> community via our Champion, and we also welcome ASF mentors familiar
>>>> with the OCW project to help mentor our project. In addition, our
>>>> team is extremely welcoming of ASF projects, and if there are
>>>> synergies with them we invite participation in the proposal and in
>>>> the discussion.
>>>>
>>>> === Homogeneous Developers ===
>>>>
>>>> The current community is within JPL but we would like to increase the
>>>> heterogeneity.
>>>>
>>>> === Reliance on Salaried Developers ===
>>>>
>>>> The initial committers are full-time JPL staff from 2013 to 2014. The
>>>> other committers from 2014 to 2015 are a mix of CMU faculty, students,
>>>> and JPL staff.
>>>>
>>>> === An Excessive Fascination with the Apache Brand ===
>>>>
>>>> We believe in the processes, systems, and framework Apache has put in
>>>> place. Apache is also known to foster a great community around its
>>>> projects and provide exposure. While brand is important, our
>>>> fascination with it is not excessive. We believe that the ASF is the
>>>> right home for ClimateModelDiagnosticAnalyzer and that having
>>>> ClimateModelDiagnosticAnalyzer inside of the ASF will lead to a better
>>>> long-term outcome for the Climate Science and Big Data community.
>>>>
>>>> === Documentation ===
>>>>
>>>> The ClimateModelDiagnosticAnalyzer services and documentation can be
>>>> found at: http://cmacws.jpl.nasa.gov:8080/cmac/.
>>>>
>>>> === Initial Source ===
>>>>
>>>> Current source resides in ...
>>>>
>>>> === External Dependencies ===
>>>>
>>>> ClimateModelDiagnosticAnalyzer depends on a number of open source
>>>> projects:
>>>>
>>>> * Flask
>>>> * Gunicorn
>>>> * Tornado Web Server
>>>> * GNU Octave
>>>> * EPD Python
>>>> * NOAA Ferret
>>>> * gnuplot
>>>>
>>>> == Required Resources ==
>>>>
>>>> === Developer and user mailing lists ===
>>>>
>>>> * [email protected] (with moderated subscriptions)
>>>> * [email protected]
>>>> * [email protected]
>>>> * [email protected]
>>>>
>>>> A git repository
>>>>
>>>> https://git-wip-us.apache.org/repos/asf/incubator-cmda.git
>>>>
>>>> A JIRA issue tracker
>>>>
>>>> https://issues.apache.org/jira/browse/CMDA
>>>>
>>>> === Initial Committers ===
>>>>
>>>> The following is a list of the planned initial Apache committers (the
>>>> active subset of the committers for the current repository at Google
>>>> Code).
>>>>
>>>> * Seungwon Lee ([email protected])
>>>> * Lei Pan ([email protected])
>>>> * Chengxing Zhai ([email protected])
>>>> * Benyang Tang ([email protected])
>>>>
>>>>
>>>> === Affiliations ===
>>>>
>>>> JPL
>>>>
>>>> * Seungwon Lee
>>>> * Lei Pan
>>>> * Chengxing Zhai
>>>> * Benyang Tang
>>>>
>>>> CMU
>>>>
>>>> * Jia Zhang
>>>> * Wei Wang
>>>> * Chris Lee
>>>> * Xing Wei
>>>>
>>>> == Sponsors ==
>>>>
>>>> NASA
>>>>
>>>> === Champion ===
>>>>
>>>> Chris Mattmann (NASA/JPL)
>>>>
>>>> === Nominated Mentors ===
>>>>
>>>> TBD
>>>>
>>>> === Sponsoring Entity ===
>>>>
>>>> The Apache Incubator

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
