Here is what I'd do for this task. In fact we have a three-dimensional model:
1. time (you omitted it completely, and I'm quite sure the business will be
   interested in this dimension!), modelled as a tree of nodes; a granularity
   of days is probably sufficient, or maybe hours?
2. locations
3. trips (not trucks)
A truck is the root of a "brush" of trips, which are directed linked lists of
events in time. Each trip starts with a departure event, each stop is 2
events (arrival and departure), and the trip ends with an arrival event.
Doing stats per truck won't be too big a problem because trips are
(comparatively) short, maybe 1-2 dozen stops.
Each event has a relationship to one calendar :Day node, with the exact event
timestamp stored in a property. (You may ask: if we go this way, do we still
need to store event sequences in trip lists? The answer is "yes", and later
you may store some useful business data in the properties of the
relationships which link events in pairs.)
Also, each event has a relationship to one location node.
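
To make the shape of a trip concrete, here is a minimal in-memory sketch in
Python (it is not Neo4j code). The names Event and build_trip and the sample
cities and times are made up for illustration. It only shows that a trip with
n stops becomes a chain of 2n + 2 events, each carrying its location and
exact timestamp, with the key of the :Day node derived from that timestamp.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Event:
    kind: str                       # "Departure" or "Arrival"
    location: str                   # name of the location node the event links to
    timestamp: datetime             # exact time, stored on the link to the :Day node
    next: Optional["Event"] = None  # next event in the trip's linked list

    @property
    def day(self) -> str:
        # Key of the calendar :Day node, in the printable_date format
        return self.timestamp.strftime("%Y.%m.%d")


def build_trip(origin, departed, stops, destination, arrived) -> List[Event]:
    """stops is a list of (location, arrival_time, departure_time) tuples."""
    events = [Event("Departure", origin, departed)]
    for location, arrival_time, departure_time in stops:
        # Every intermediate stop contributes two events
        events.append(Event("Arrival", location, arrival_time))
        events.append(Event("Departure", location, departure_time))
    events.append(Event("Arrival", destination, arrived))
    # Chain the events in time order, as the trip list would in the graph
    for current, following in zip(events, events[1:]):
        current.next = following
    return events


# Hypothetical sample trip with one intermediate stop
trip = build_trip(
    "Munich", datetime(2015, 1, 25, 8, 0),
    [("Nuremberg", datetime(2015, 1, 25, 10, 0), datetime(2015, 1, 25, 11, 0))],
    "Hamburg", datetime(2015, 1, 26, 6, 30),
)
print(len(trip), trip[0].day, trip[-1].day)
```

With 1-2 dozen stops per trip such a chain stays at a few dozen events, which
is why walking it per truck is cheap.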
How to cope with 3k+ locations, 50k+ trucks and 20+ million trips? The
answer again is... time. Encode time periods in relationship types; I'd go
for a scale of a week. So each base relationship type gets 52 new weekly
variants per year. Say 2 events per truck per day: that is 700k+ events per
week, and 20% (1k) of the locations will take 80% (560k) of those, so each
week will add some 300 incoming and 300 outgoing relationships per busy
location. That's not a big problem, especially if you decide to create a new
node per location each week. Some inconvenience will occur when processing
long trips which cross week boundaries, but nothing terrible. So consider the
following fragment:
(city:Location {name: 'Munich'})<-[ra:ARRIVED_AT_2015week4]-(e0:Event:Arrival)
  -[rb:ARRIVED_ON_2015week4]->(d0:Day {printable_date: '2015.01.25'})
(city:Location {name: 'Munich'})-[rc:DEPARTED_AT_2015week5]->(e1:Event:Departure)
  -[rd:DEPARTED_ON_2015week5]->(d1:Day {printable_date: '2015.01.26'})
(t:Truck {license_plate: 'ABC123'})<-[:TRIP_2015week5]-(e1:Event:Departure)
  <-[:TRIP_2015week4]-(e0:Event:Arrival) // direction in sequence
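
Two parts of the week-partitioning idea can be checked mechanically: the
relationship type names and the volume estimate. Below is a rough Python
sketch. The helper names weekly_rel_type and weekly_load are hypothetical,
and ISO week numbering is my assumption (it happens to agree with the week4
and week5 labels in the fragment). The inputs are the round figures from the
estimate above, not measured data.

```python
from datetime import date


def weekly_rel_type(base: str, day: date) -> str:
    """Build a week-scoped relationship type such as ARRIVED_AT_2015week4."""
    # Assumption: ISO weeks. Near New Year the ISO year can differ from
    # the calendar year, so take the year from isocalendar() as well.
    iso_year, iso_week, _ = day.isocalendar()
    return f"{base}_{iso_year}week{iso_week}"


def weekly_load(trucks, events_per_truck_per_day, busy_locations, busy_share):
    """Rough weekly event volume and per-location load for busy locations."""
    events_per_week = trucks * events_per_truck_per_day * 7
    busy_events = round(events_per_week * busy_share)
    per_location = busy_events / busy_locations
    # Arrivals point into a location and departures point out of it,
    # so assume the load splits evenly between the two directions.
    return events_per_week, busy_events, per_location / 2


print(weekly_rel_type("ARRIVED_AT", date(2015, 1, 25)))
print(weekly_rel_type("DEPARTED_AT", date(2015, 1, 26)))
print(weekly_load(50_000, 2, 1_000, 0.8))
```

The last call gives 280 relationships per direction per busy location per
week, which is where the "some 300" figure comes from.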
As for your code, am I correct in assuming that you just want to discover
all paths between Munich and Hamburg which
a) do not contain cycles,
b) were actually driven in reality at least once?
If yes, and if we want to examine data from all weeks, won't this
MATCH (src:Location {name: 'Munich'})-->(e0:Event:Departure)
WITH e0
MATCH (e1:Event:Arrival)-->(dst:Location {name: 'Hamburg'})
WITH e0, e1
MATCH p = (e0)-[r*1..]->(e1)
WITH DISTINCT p
RETURN p
ORDER BY LENGTH(p) DESC;
give you what you want?
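
To pin down what I mean by criteria a) and b), here is a small in-memory
Python sketch. It is only an illustration of the requirement, not an
equivalent of the Cypher query above. The function name observed_routes and
the sample trips are made up, and each trip is simplified to its ordered list
of location names.

```python
def observed_routes(trips, src, dst):
    """Distinct src -> dst routes that occur inside recorded trips
    and visit no location twice, longest first."""
    found = set()
    for stops in trips:
        for i, start in enumerate(stops):
            if start != src:
                continue
            for j in range(i + 1, len(stops)):
                if stops[j] != dst:
                    continue
                route = tuple(stops[i:j + 1])
                # Criterion a): reject routes that revisit a location
                if len(set(route)) == len(route):
                    found.add(route)
    # Criterion b) holds by construction: every route comes from a real trip
    return sorted(found, key=lambda r: (-len(r), r))


# Hypothetical sample data
trips = [
    ["Munich", "Nuremberg", "Hamburg"],
    ["Munich", "Hamburg"],
    ["Munich", "Berlin", "Munich", "Hamburg"],  # contains a circle
]
for route in observed_routes(trips, "Munich", "Hamburg"):
    print(" -> ".join(route))
```

The third sample trip contributes only its direct Munich -> Hamburg leg,
because the full sequence passes through Munich twice.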
WBR,
Andrii