Following up on modelling approach 1) that Michael sketched previously: with 50k trucks, using a separate relationship type per truck is a bad idea, since at the moment Neo4j supports only about 65k different relationship types.
Just as a stupid idea to discuss:

* How about using 10k relationship types for all the trucks, TRUCK_ROUTE_0000 to TRUCK_ROUTE_9999? Each individual truck is mapped to a rel type using a modulo operation on its identifier (think of consistent hashing). On average, one rel type is shared between 5 trucks.
* A traversal for a specific truck then needs to follow just one out of 10k relationship types. Of course you still need to inspect the relationship property to decide which of the 5 trucks it is, but it should be faster than using the same rel type for all trucks.

/Stefan

2015-02-18 10:07 GMT+01:00 Michael Hunger <[email protected]>:

> Perhaps you can share some of your Expander code?
>
> Not really sure between what your edges are?
>
>
> Three ideas:
>
> 1) How many trucks do you have? Perhaps it makes sense to encode the
> truck-id as relationship-type? So you have fewer rels to check and can
> benefit from the separated storage by rel-type and direction.
> 2) Model the trip as a node connected to a truck, and all locations it
> visited (perhaps/optionally even encode the location-id as rel-id but that
> might be overkill) so you can quickly find all that started at "A" and then
> check if the trip has a rel to "B"
>
> 3) Another, more verbose approach would be to model each trip as a sequence of
> nodes (which are shadow nodes of the locations), connect the start-node of
> the trip to the truck (optionally all trip-nodes of the trip to the truck).
> And then have a relationship to each stop of the trip.
>
> I'd probably go with model #2
>
> HTH Michael
>
>
> On 16.02.2015 at 12:54, [email protected] wrote:
>
> I need some modelling advice.
>
> We want to store and analyse movement patterns. Think of trucks moving
> through a logistics network.
> We want to ask which truck has ever moved from location A to location B and
> what was the sequence of intermediate stops they made to get there.
>
> In a later stage we also want to be able to ask this question if there is no
> truck that has stopped at both location A and B: which trucks and which sequence
> of stops would we have needed to get from A to B?
>
> Right now we have modeled all locations as nodes and every trip a truck has ever
> made as a separate edge. The edges are attributed with a truck ID and a
> sequence number.
> We wrote our custom expander class to be used with the traversal framework
> and to take care of the sequence numbers and truck IDs to only get complete
> sequences for individual trucks.
>
> However, this performs very badly.
> Right now we have 300 locations/nodes and 300,000 trips/edges. Some stops
> have 20,000 outgoing trips that we are checking for truck ID and sequence
> number (for every outgoing relationship, get attributes and check).
> This is too slow: 13 seconds for 900 sequences.
>
> Eventually, we want to try to scale it to 3000 locations and 20,000,000 trips.
>
>
> Do you have any alternative modelling ideas?
>
> Thanks a lot already.
>
>
> PS: I was thinking of storing every truck's list as a long linear sequence
> of stops/nodes. The nodes are additionally linked to some identifier node
> through an "is" type of relation: "stop x is location A".

-- 
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
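
For reference, the bucketing idea from the top of this message could look roughly like the sketch below. It is plain Python rather than Neo4j API code, and it assumes numeric truck IDs. The names `NUM_BUCKETS`, `rel_type_for_truck`, `trips_for_truck` and the dictionary keys `type`, `truck_id`, `seq` are all made up for illustration; the dictionaries only stand in for relationships and their properties.

```python
NUM_BUCKETS = 10_000  # assumed number of shared relationship types


def rel_type_for_truck(truck_id: int, buckets: int = NUM_BUCKETS) -> str:
    """Map a numeric truck ID to one of `buckets` relationship type names."""
    width = len(str(buckets - 1))  # 4 digits for 10k buckets
    return f"TRUCK_ROUTE_{truck_id % buckets:0{width}d}"


def trips_for_truck(rels, truck_id: int):
    """Return the trips of one truck, ordered by sequence number.

    `rels` is an iterable of dicts with the hypothetical keys
    'type', 'truck_id' and 'seq'. Only relationships of the truck's
    bucket type are considered; the property check then separates the
    trucks that share that bucket.
    """
    wanted = rel_type_for_truck(truck_id)
    hits = [r for r in rels
            if r["type"] == wanted and r["truck_id"] == truck_id]
    return sorted(hits, key=lambda r: r["seq"])


if __name__ == "__main__":
    # Trucks 42 and 10042 share a bucket, truck 7 does not.
    rels = [
        {"type": rel_type_for_truck(t), "truck_id": t, "seq": s}
        for t, s in [(42, 2), (10042, 1), (42, 1), (7, 1)]
    ]
    print(rel_type_for_truck(42))
    print(trips_for_truck(rels, 42))
```

In a real traversal the type check would be done by the store (only relationships of that one type are expanded), so only the property check on the few trucks sharing a bucket remains in the expander.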
