Hi All,

I'm in the analysis phase of evaluating alternative databases to IBM DB2.
The primary drivers for change are cost (unsurprisingly) and flexibility.
We have a lot of SQL code and stored procedures, so updating the data model
is a nightmare for maintainability and even more so for test effort.

Our data is in the travel domain: we hold data on every flight from all the
major airlines.
We have entities such as Flight, which has a Route. Each Route has a
departure and an arrival airport, and each Flight has an airline that
operates it, an aircraft type that the passengers fly on, etc.
Inbound data is very airline-centric, but we build a warehouse from which we
produce data products that aggregate data across all carriers, or subsets
of them.
Output is mainly data files, but also some web services.
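
To make the model a bit more concrete, here is a rough Java sketch of those
entities as plain records, plus one cross-carrier aggregate. The record and
field names, the airport/airline codes and the flightsPerRoute helper are all
made up for illustration; this is not our actual schema.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical entity types; names and fields are illustrative only.
record Airport(String code) {}
record Airline(String code) {}
record AircraftType(String code) {}

// A route links a departure airport to an arrival airport.
record Route(Airport departure, Airport arrival) {}

// A flight is operated by an airline on a route with a given aircraft type.
record Flight(Airline operator, String flightNumber, Route route, AircraftType aircraft) {}

final class FlightGraphSketch {

    // Cross-carrier aggregate: number of flights per route, whatever the airline.
    static Map<Route, Long> flightsPerRoute(List<Flight> flights) {
        return flights.stream()
                .collect(Collectors.groupingBy(Flight::route, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Placeholder codes, not real airports or carriers.
        Route route = new Route(new Airport("AAA"), new Airport("BBB"));
        List<Flight> flights = List.of(
                new Flight(new Airline("XX"), "1", route, new AircraftType("T1")),
                new Flight(new Airline("YY"), "2", route, new AircraftType("T2")));
        System.out.println(flightsPerRoute(flights));
    }
}
```

In a graph model I'd expect each record to become a node and each reference
between records to become a relationship, but that mapping is my assumption
at this stage.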

I was thinking that graph databases might be a good fit.

I'm looking at Neo4j because 90% of our code is implemented in Java.

There are lots of other data entities that need to be modelled, but I'd be
interested to hear from anyone with experience of moving from an RDBMS to
Neo4j.

Our data set isn't really big data, as the industry traditionally summarises
flight data by a period and days of operation. For example, a flight operates
from 01MARxx to 30SEPxx on days MonTueWed. Maybe it's a different aircraft
on Thu/Fri, so we have a different flight entity on those days. Periods
can overlap, but the combination of period and days cannot, so airline,
flight number, period and days together form a logical key.
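
As a sketch of that uniqueness rule in Java: two entries for the same airline
and flight number clash only when their periods overlap and they share a day
of operation. The ScheduleEntry record, its field names and the clashesWith
method are hypothetical, and the year is arbitrary since the dates above are
written as xx.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical schedule entry; the five fields together act as the logical key.
record ScheduleEntry(String airline, String flightNumber,
                     LocalDate from, LocalDate to, Set<DayOfWeek> days) {

    // Two entries clash when they describe the same flight, their periods
    // overlap (both ends inclusive) and they share at least one day of operation.
    boolean clashesWith(ScheduleEntry other) {
        boolean sameFlight = airline.equals(other.airline)
                && flightNumber.equals(other.flightNumber);
        boolean periodsOverlap = !from.isAfter(other.to) && !other.from.isAfter(to);
        boolean daysOverlap = !Collections.disjoint(days, other.days);
        return sameFlight && periodsOverlap && daysOverlap;
    }
}

final class LogicalKeySketch {
    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2024, 3, 1);  // arbitrary year
        LocalDate end = LocalDate.of(2024, 9, 30);
        ScheduleEntry monToWed = new ScheduleEntry("XX", "100", start, end,
                EnumSet.of(DayOfWeek.MONDAY, DayOfWeek.TUESDAY, DayOfWeek.WEDNESDAY));
        ScheduleEntry thuFri = new ScheduleEntry("XX", "100", start, end,
                EnumSet.of(DayOfWeek.THURSDAY, DayOfWeek.FRIDAY));
        // Same period, different days: allowed under the rule above.
        System.out.println(monToWed.clashesWith(thuFri));
    }
}
```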

We have about 7M flight records, but some peripheral data can be in the
100M range. Booking classes are the key example of that: each flight has a
booking class, and booking classes are grouped into cabins such as First,
Business, Premium Economy and Economy.

However, when performing updates, this summarisation can cause a lot of
fragmentation and load on the DB2 database. One thought I had was to
decompose the data into individual days and then aggregate back up
using something like a map/reduce approach.
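
Roughly what I have in mind, as a minimal single-process Java sketch rather
than a real map/reduce job: expand each summarised record into one record per
operating date, then group the daily records back into a period plus days of
operation. PeriodSummary, DailyFlight, expand and summarise are hypothetical
names, and the reduce step assumes each aircraft flies one regular weekly
pattern with no gaps, which real data would not guarantee.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical types for illustration; a real flight carries far more attributes.
record PeriodSummary(String aircraft, LocalDate from, LocalDate to, Set<DayOfWeek> days) {}
record DailyFlight(LocalDate date, String aircraft) {}

final class DailyDecompositionSketch {

    // "Map" step: expand one summarised record into one record per operating date.
    static List<DailyFlight> expand(PeriodSummary s) {
        return s.from().datesUntil(s.to().plusDays(1)) // end date is inclusive
                .filter(d -> s.days().contains(d.getDayOfWeek()))
                .map(d -> new DailyFlight(d, s.aircraft()))
                .collect(Collectors.toList());
    }

    // "Reduce" step: group daily records by aircraft and rebuild period + days.
    // Simplification: no gap detection, so one summary per aircraft is produced.
    static List<PeriodSummary> summarise(List<DailyFlight> daily) {
        Map<String, List<LocalDate>> datesByAircraft = daily.stream()
                .collect(Collectors.groupingBy(DailyFlight::aircraft, TreeMap::new,
                        Collectors.mapping(DailyFlight::date, Collectors.toList())));
        List<PeriodSummary> result = new ArrayList<>();
        for (Map.Entry<String, List<LocalDate>> e : datesByAircraft.entrySet()) {
            List<LocalDate> dates = e.getValue();
            Set<DayOfWeek> days = dates.stream().map(LocalDate::getDayOfWeek)
                    .collect(Collectors.toCollection(() -> EnumSet.noneOf(DayOfWeek.class)));
            result.add(new PeriodSummary(e.getKey(),
                    Collections.min(dates), Collections.max(dates), days));
        }
        return result;
    }

    public static void main(String[] args) {
        // Arbitrary example period; "T1" is a placeholder aircraft code.
        PeriodSummary monToWed = new PeriodSummary("T1",
                LocalDate.of(2024, 3, 4), LocalDate.of(2024, 3, 27),
                EnumSet.of(DayOfWeek.MONDAY, DayOfWeek.TUESDAY, DayOfWeek.WEDNESDAY));
        List<DailyFlight> daily = expand(monToWed);
        System.out.println(daily.size() + " daily records -> " + summarise(daily));
    }
}
```

One caveat with this version: the rebuilt period runs from the first to the
last date actually operated, so it can come back narrower than the period
that was originally filed.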

Other database solutions might work better with a map/reduce model, but
it's the network aspect of our data that makes graph databases appealing.

Any thoughts or wisdom?

Cheers for your help

Dan

