Bob, if "real time" means "up to a few minutes is acceptable", then I'd 
recommend you use Storm to do any pre-load processing and write the result to a 
text/CSV/etc. file in a directory.  Then use a separate utility (most databases 
ship a bulk loader that does this) to load data from the files you create into 
the database.
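
A minimal sketch of the file hand-off, in Python for brevity (a Storm bolt 
would do the equivalent in Java).  The names write_batch, ready_files and 
spool_dir, and the ".tmp" suffix, are made up for illustration; they are not 
part of Storm or of any database.  The point is that the writer renames the 
file only once it is complete, so the loader never picks up a half-written 
file.

```python
import csv
import os
import uuid


def write_batch(rows, spool_dir):
    """Write one batch of processed rows to a CSV file in spool_dir.

    The file is written under a temporary name and renamed when complete,
    so a loader that only picks up *.csv never sees a partial file.
    """
    os.makedirs(spool_dir, exist_ok=True)
    final_path = os.path.join(spool_dir, f"batch-{uuid.uuid4().hex}.csv")
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    # Rename is atomic when both paths are on the same filesystem.
    os.replace(tmp_path, final_path)
    return final_path


def ready_files(spool_dir):
    """Return the completed batch files a bulk loader could pick up."""
    return sorted(
        os.path.join(spool_dir, name)
        for name in os.listdir(spool_dir)
        if name.endswith(".csv")
    )
```

The database's own bulk loader (for example PostgreSQL COPY, MySQL LOAD DATA 
INFILE, SQL Server bcp or Oracle SQL*Loader) would then be pointed at the 
files that ready_files returns, on whatever schedule fits your "few minutes".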

This sounds slower, but remember that establishing a connection to a database 
to run a SQL INSERT has noticeable latency.  Each connection also (usually) 
takes a port/socket and memory, and is often a separate OS task, so you are 
consuming resources that you would probably want Storm using.
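
To make the contrast concrete, here is a rough sketch of the two access 
patterns.  It uses Python's built-in sqlite3 module purely as a stand-in: 
SQLite is an in-process file database, so it has no network connection cost 
and will understate the overhead a real warehouse would show.  The events 
table and the function names are assumptions for illustration.

```python
import sqlite3

INSERT_SQL = "INSERT INTO events (id, payload) VALUES (?, ?)"


def create_table(db_path):
    """Create the hypothetical target table used by both examples."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(id INTEGER PRIMARY KEY, payload TEXT)"
        )
        conn.commit()
    finally:
        conn.close()


def insert_one_connection_per_row(db_path, rows):
    """Open a fresh connection for every row (the pattern to avoid)."""
    for row in rows:
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(INSERT_SQL, row)
            conn.commit()
        finally:
            conn.close()


def insert_batched(db_path, rows):
    """Open one connection and insert the whole batch in one transaction."""
    conn = sqlite3.connect(db_path)
    try:
        conn.executemany(INSERT_SQL, rows)
        conn.commit()
    finally:
        conn.close()
```

Both functions leave the same rows in the table; the difference is that the 
first pays the connection setup and commit cost once per record, while the 
second pays it once per batch.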

There are other solutions for something closer to real time, but they require 
an in-memory database or "fun with caching" which will require specialized 
expertise.

HTH



________________________________
From: Adaryl "Bob" Wakefield, MBA [[email protected]]
Sent: Friday, March 06, 2015 7:54 PM
To: [email protected]
Subject: real time warehouse loads

I’m looking at Storm as a method to load data warehouses in real time. I am not 
that familiar with Java. I’m curious about the actual mechanism to load records 
into tables. Is it just a matter of feeding the final result of processing into 
an INSERT INTO SQL statement or is it more complicated than that? It seems to me 
that hammering the database with SQL statements of real time data is a bit 
inefficient.

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
