Hi Guys, 

We're currently working at JPL on figuring out how we can make a Shark/Spark
interface for Apache OODT which can be used for ETL and workflow management:

http://oodt.apache.org/

OODT currently supports RDBMS and Solr/Lucene, and we are also working on a
Gora plugin for it.

Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-5th floor
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: William Kang <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, May 22, 2014 7:49 AM
To: "[email protected]" <[email protected]>
Subject: ETL and workflow management on Spark

>
>
>
>Hi,
>We are moving toward adopting the full Spark stack. So far, we have used
>Shark to do some ETL work, which is not bad but is not perfect either. We
>ended up writing UDF and UDGF, UDAF that could be avoided if we could use Pig.
>
>
>Do you have any suggestions for an ETL solution in the Spark stack?
>
>
>And does anyone have a working workflow management solution with Spark?
>
>
>Many thanks.
>
>
>
>
>Cao
