Hi All, I'm in the analysis phase of working out an alternative database to IBM DB2. The primary drivers for change are cost (unsurprisingly) and flexibility. We have a lot of SQL code and stored procedures, and updating the data model is a nightmare in terms of maintainability and, even more so, test effort.
Our data is in the travel domain and covers every flight from all the major airlines. We have entities such as Flight, which has a Route; each Route has a departure and an arrival airport, an airline that operates the flight, an aircraft type that the passengers fly on, and so on. Inbound data is very airline-centric, but we build a warehouse from which we produce data products that aggregate data across all carriers, or subsets of them. Output is mainly data files, but also some web services.

I was thinking that graph databases might be a good fit. I'm looking at Neo4j because 90% of our code is implemented in Java. There are lots of other data entities that need to be modelled, but I'd be interested to hear whether anyone has experience of moving from an RDBMS to Neo4j.

Our data set isn't really big data, as the industry traditionally summarises flight data by a period and days of operation (so a flight operates from 01MARxx to 30SEPxx on days MonTueWed, for example). Maybe it's a different aircraft on Thursday and Friday, so we have a different flight entity on those days. Periods can overlap, but not the combination of period and days, so that airline, flight number, period and days together form a logical key. We have about 7M flight records, but some peripheral data can be in the 100M range (booking classes are the key example of that: each flight has a booking class, and the booking class is grouped into cabins such as First, Business, Premium Economy and Economy).

However, when performing updates this summarisation can cause a lot of fragmentation and load on the DB2 database. One thought I had was to decompose the data into individual days and then aggregate back up using something like a map/reduce approach. Other database solutions might work better with a map/reduce model, but it's the network aspect of our data that appeals with respect to graph databases.

Any thoughts or wisdom?
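To make the decompose-and-aggregate idea a bit more concrete, here is a rough Java sketch of what I have in mind. It is only an illustration: the `FlightPeriods` and `Period` names, the 2024 dates and the week-by-week merge rule are all assumptions made up for this example, not our real model, and it ignores airline, flight number and aircraft type entirely.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.util.ArrayList;
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of any real schema.
final class FlightPeriods {

    /** An inclusive date range plus the weekdays the flight operates on. */
    static final class Period {
        final LocalDate from;
        final LocalDate to;
        final Set<DayOfWeek> days;

        Period(LocalDate from, LocalDate to, Set<DayOfWeek> days) {
            this.from = from;
            this.to = to;
            this.days = days;
        }

        @Override
        public String toString() {
            return from + ".." + to + " " + days;
        }
    }

    /** "Map" step: expand a period into the individual dates it operates on. */
    static TreeSet<LocalDate> expand(Period p) {
        TreeSet<LocalDate> dates = new TreeSet<>();
        for (LocalDate d = p.from; !d.isAfter(p.to); d = d.plusDays(1)) {
            if (p.days.contains(d.getDayOfWeek())) {
                dates.add(d);
            }
        }
        return dates;
    }

    /**
     * "Reduce" step (simplified assumption): bucket dates by the Monday of
     * their week, then merge consecutive weeks that share the same weekday set.
     */
    static List<Period> collapse(Collection<LocalDate> dates) {
        TreeMap<LocalDate, TreeSet<LocalDate>> byWeek = new TreeMap<>();
        for (LocalDate d : dates) {
            LocalDate monday = d.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
            byWeek.computeIfAbsent(monday, k -> new TreeSet<>()).add(d);
        }

        List<Period> out = new ArrayList<>();
        LocalDate from = null, to = null, lastWeek = null;
        Set<DayOfWeek> days = null;
        for (Map.Entry<LocalDate, TreeSet<LocalDate>> e : byWeek.entrySet()) {
            Set<DayOfWeek> weekDays = EnumSet.noneOf(DayOfWeek.class);
            for (LocalDate d : e.getValue()) {
                weekDays.add(d.getDayOfWeek());
            }
            boolean continues = days != null
                    && weekDays.equals(days)
                    && e.getKey().equals(lastWeek.plusWeeks(1));
            if (!continues) {
                if (days != null) {
                    out.add(new Period(from, to, days)); // close the previous run
                }
                from = e.getValue().first();
                days = weekDays;
            }
            to = e.getValue().last();
            lastWeek = e.getKey();
        }
        if (days != null) {
            out.add(new Period(from, to, days));
        }
        return out;
    }

    public static void main(String[] args) {
        Period p = new Period(LocalDate.of(2024, 3, 4), LocalDate.of(2024, 3, 24),
                EnumSet.of(DayOfWeek.MONDAY, DayOfWeek.TUESDAY, DayOfWeek.WEDNESDAY));
        TreeSet<LocalDate> dates = expand(p);
        dates.remove(LocalDate.of(2024, 3, 12)); // pretend one Tuesday is updated away
        collapse(dates).forEach(System.out::println);
    }
}
```

The point is that an update touches individual dates only, and the period records are regenerated afterwards rather than split and patched in place. Two caveats with this particular merge rule: the rebuilt period ends on the last operating date rather than the originally published end date, and a period that starts or ends mid-week comes back as a separate short period for the partial week.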
Cheers for your help
Dan
