Hi Roger,

Great to hear from you, and thanks for considering Apache OODT!

I would say that, at a high level, Apache OODT is a project centered around three themes:

1. Data management and archival
2. Data processing
3. Data sharing

Depending on your use case, it would make sense to first identify which of those you are interested in, and then investigate the relevant modules in depth. OODT is a component-based architecture, so you can use modules independently of one another if desired (or use a packaged bundle, like those mentioned in the Quick-start section below). Much of our documentation is currently on our wiki [1], so that is a good place to start.

Here are some resources:

Quick-start with OODT:
* Vagrant Virtual Machine with latest OODT (all components) pre-installed: https://cwiki.apache.org/confluence/display/OODT/Vagrant+Powered+OODT
* RADiX (i.e. OODT, all components, packaged together through a single Maven build): https://cwiki.apache.org/confluence/display/OODT/RADiX+Powered+By+OODT

Data Management and Archival (i.e. taking raw products, extracting metadata, archiving metadata and products):
* File Manager Developer Guide: http://oodt.apache.org/components/maven/filemgr/development/developer.html
* File Manager Policy (i.e. describing the nature of your products for archival): https://cwiki.apache.org/confluence/display/OODT/Everything+you+want+to+know+about+File+Manager+Policy
* Crawler (i.e. how to get your products into File Manager): https://cwiki.apache.org/confluence/display/OODT/OODT+Crawler+Help

Data Processing (i.e. transforming data already archived or to be archived):
* Workflow Manager Developer Guide: http://oodt.apache.org/components/maven/workflow/development/developer.html
* CAS-PGE Learn By Example (i.e. how to wrap your external algorithms into workflows): https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example

Data Sharing (i.e. sharing and accessing your archive between machines):
* Web-grid overview: http://oodt.apache.org/components/maven/grid/slides.pdf

To answer your second question, about experiences with OODT: I've personally been using it for archival management on at least two climate science projects at NASA Jet Propulsion Laboratory, and for data processing on two other projects. I think the integration of the Solr catalog makes OODT an attractive choice for metadata cataloging, and the workflow manager makes it easier to create and execute batch jobs involving external algorithms. OODT has a fairly steep learning curve (we are working on improving documentation!), but once you get the hang of the components, it's definitely a useful software package.

Hope that helps!

Rishi

--
[1] https://cwiki.apache.org/confluence/display/OODT/Home

On Aug 6, 2014, at 2:01 PM, Roger Carter wrote:

Hi Everyone,

I'm new to the Apache scene; I have experience with Matlab and minimal experience with Python. This seems like a powerful tool and I'd like to learn more. If anyone is willing to provide recommendations for resources or detail their experiences in learning Apache OODT, I would be most grateful.

Thanks,
Roger
