Re: Advice for new user

Verma, Rishi (398J) Wed, 06 Aug 2014 15:26:49 -0700

Hi Roger,

Great to hear from you, and thanks for considering Apache OODT!

At a high level, Apache OODT is a project centered on three
themes:
1. Data management and archival
2. Data processing
3. Data sharing

Depending on your use case, it would make sense to first identify which of
those you are interested in, and then investigate the relevant modules in
depth. OODT has a component-based architecture, so you can use modules
independently of one another if desired (or use a packaged bundle, as
mentioned in the Quick-start section below). Much of our documentation is
currently on our wiki [1], so that is a good place to start.

Here are some resources:

Quick-start with OODT:
* Vagrant Virtual Machine with latest OODT (all components) pre-installed:
https://cwiki.apache.org/confluence/display/OODT/Vagrant+Powered+OODT
* RADiX (i.e. OODT, all components, packaged together through a single Maven
build): https://cwiki.apache.org/confluence/display/OODT/RADiX+Powered+By+OODT

Data Management and Archival (i.e. taking raw products, extracting metadata,
archiving metadata and products):
* File Manager Developer Guide:
http://oodt.apache.org/components/maven/filemgr/development/developer.html
* File Manager Policy (i.e. describing the nature of your products for
archival):
https://cwiki.apache.org/confluence/display/OODT/Everything+you+want+to+know+about+File+Manager+Policy
* Crawler (i.e. how to get your products into File Manager):
https://cwiki.apache.org/confluence/display/OODT/OODT+Crawler+Help

Data Processing (i.e. transforming data already archived or to-be archived):
* Workflow Manager Developer Guide:
http://oodt.apache.org/components/maven/workflow/development/developer.html
* CAS-PGE Learn By Example (i.e. how to wrap your external algorithms into
workflows):
https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example

Data Sharing (i.e. sharing and accessing your archive between machines):
* Web-grid overview: http://oodt.apache.org/components/maven/grid/slides.pdf

To answer your second question, about experiences with OODT: I've personally
used it for archival management on at least two climate science projects at
NASA Jet Propulsion Laboratory, and for data processing on two other
projects. I think the integration of the Solr catalog makes OODT an attractive
choice for metadata cataloging, and the workflow manager makes it easier to
create and execute batch jobs involving external algorithms. OODT has a
fairly steep learning curve (we are working on improving documentation!), but
once you get the hang of the components, it's definitely a useful software
package.

Hope that helps!

Rishi

--
[1] https://cwiki.apache.org/confluence/display/OODT/Home

On Aug 6, 2014, at 2:01 PM, Roger Carter wrote:

Hi Everyone,

I'm new to the Apache scene; I have experience with Matlab and minimal
experience with Python. This seems like a powerful tool and I'd like to
learn more. If anyone is willing to provide recommendations for resources
or detail their experiences in learning Apache OODT, I would be most
grateful.

Thanks,
Roger
