You could also give Cascading Lingual a try: http://www.cascading.org/lingual/ http://docs.cascading.org/lingual/1.0/
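Lingual exposes a JDBC driver, so the nightly crunch can be driven as plain SQL from Java. A minimal sketch of that, assuming the Lingual jars are on the classpath; the `SALES` table, its columns, and the query are made up for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class LingualSketch {
    public static void main(String[] args) {
        try {
            // Register Lingual's JDBC driver (needs the cascading-lingual
            // jars on the classpath).
            Class.forName("cascading.lingual.jdbc.Driver");

            // "local" runs in local mode; on a cluster you would use a
            // hadoop-flavored connection string instead.
            Connection con = DriverManager.getConnection("jdbc:lingual:local");
            Statement stmt = con.createStatement();

            // Hypothetical summary query over a made-up SALES table.
            ResultSet rs = stmt.executeQuery(
                "SELECT REGION, SUM(AMOUNT) FROM SALES GROUP BY REGION");
            while (rs.next())
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            con.close();
        } catch (ClassNotFoundException e) {
            // Reached when the Lingual jars are not on the classpath.
            System.out.println("lingual driver not on classpath");
        } catch (SQLException e) {
            System.out.println("query failed: " + e.getMessage());
        }
    }
}
```

Since it is all JDBC, the reporting side could keep talking to Oracle unchanged while only the rebuild job moves to the cluster.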
We have a connector for Oracle (https://github.com/Cascading/cascading-jdbc#oracle), so you could read the data from Oracle, do the processing on a Hadoop cluster, and write it back into Oracle, all via SQL or via a combination of SQL and Java/Cascading (https://github.com/Cascading/cascading-jdbc#in-lingual).

- André

On Thu, Dec 19, 2013 at 9:35 PM, Jay Vee <[email protected]> wrote:

> We have a large relational database (~500 GB, hundreds of tables).
>
> We have summary tables that we rebuild from scratch each night, which
> takes about 10 hours.
> We have a web interface that accesses these summary tables to build
> reports.
>
> There is a business reason for doing a complete rebuild of the summary
> tables each night, and using views (in the sense of Oracle views) is not
> an option at this time.
>
> If I wanted to leverage Big Data technologies to speed up the summary
> table rebuild, what would be the first step in getting all the data into
> some big data storage technology?
>
> Ideally, in the end we want to retain the summary tables in a relational
> database and have reporting work the same without modifications.
>
> It's just the crunching of the data and the building of these relational
> summary tables where we need a significant performance increase.

--
André Kelpe
[email protected]
http://concurrentinc.com
