Hi,
I want to use Neo4j to analyse Apache logs. Each visitor is identified by a
session ID, and each visit has an ID too: a new visit ID is assigned after a
gap of 20 minutes between page views. The graph consists of Visitor, Visit and
Page nodes, where a Visitor links to multiple Visit nodes and a Visit links to
multiple Page nodes. The Visit -> Page relationship has a property indicating
when in the visit the page was viewed (i.e. 1 = first page visited, 2 = second
page in the visit, etc).
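
To make the model concrete, here is a minimal sketch in plain Java (no Neo4j
calls) of how I picture one visitor's page views being split into visits and
numbered. The names VisitSplitter, PlacedView and place are made up for
illustration, and I'm assuming the views arrive sorted by time and that a gap
of strictly more than 20 minutes starts a new visit.

```java
import java.util.ArrayList;
import java.util.List;

final class VisitSplitter {
    // Assumption: a gap of strictly more than 20 minutes starts a new visit.
    static final long VISIT_GAP_MILLIS = 20L * 60L * 1000L;

    /** One page view placed in the model: which visit it belongs to and where in that visit. */
    static final class PlacedView {
        final int visitIndex; // 1-based visit number for this visitor
        final int position;   // 1 = first page of the visit, 2 = second, ...
        final String page;

        PlacedView(int visitIndex, int position, String page) {
            this.visitIndex = visitIndex;
            this.position = position;
            this.page = page;
        }
    }

    /** Splits one visitor's page views (sorted by ascending timestamp) into visits. */
    static List<PlacedView> place(long[] timestampsMillis, String[] pages) {
        if (timestampsMillis.length != pages.length) {
            throw new IllegalArgumentException("timestamps and pages must have the same length");
        }
        List<PlacedView> placed = new ArrayList<>();
        int visit = 0;
        int position = 0;
        for (int i = 0; i < timestampsMillis.length; i++) {
            boolean startsNewVisit =
                i == 0 || timestampsMillis[i] - timestampsMillis[i - 1] > VISIT_GAP_MILLIS;
            if (startsNewVisit) {
                visit++;
                position = 1;
            } else {
                position++;
            }
            placed.add(new PlacedView(visit, position, pages[i]));
        }
        return placed;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        long[] times = {0, 5 * minute, 30 * minute, 31 * minute};
        String[] pages = {"/", "/about", "/", "/contact"};
        for (PlacedView v : place(times, pages)) {
            System.out.println("visit " + v.visitIndex + ", position " + v.position + ": " + v.page);
        }
    }
}
```

For the four sample views in main, the first two land in visit 1 at positions
1 and 2, and the last two land in visit 2 at positions 1 and 2. The position
value is what I would store on the Visit -> Page relationship.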
How would I best go about importing data into this graph? I'd use the batch
inserter, but before I create a new Visitor or Visit I need to check whether a
node exists already with the same ID. I've read that it's better to use the
EmbeddedGraphDatabase, but I'm going to be inserting ~100K nodes 3 times per
day. In the past, using MySQL in a similar way, performance was abysmal, so I
couldn't take this approach.
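
One idea I've been toying with for the existence check is to keep my own
in-memory map from session or visit ID to node id during the import, so that
each ID is only created once. A rough sketch, with assumptions: NodeIdCache and
its createNode function are hypothetical stand-ins for whatever insert call I
end up using, not part of the Neo4j API, and the map only knows about IDs seen
in the current run, so it would have to be rebuilt from the existing graph
before each of the three daily imports.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class NodeIdCache {
    // Maps an external ID (session ID or visit ID) to the node id it was stored under.
    private final Map<String, Long> nodeIdByExternalId = new HashMap<>();
    // Hypothetical hook: creates a node for the given external ID and returns its node id.
    private final Function<String, Long> createNode;
    private int created = 0;

    NodeIdCache(Function<String, Long> createNode) {
        this.createNode = createNode;
    }

    /** Returns the node id for externalId, creating the node only on first sight. */
    long getOrCreate(String externalId) {
        Long existing = nodeIdByExternalId.get(externalId);
        if (existing != null) {
            return existing;
        }
        long nodeId = createNode.apply(externalId);
        nodeIdByExternalId.put(externalId, nodeId);
        created++;
        return nodeId;
    }

    /** Number of nodes actually created through this cache. */
    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        long[] nextNodeId = {0L}; // fake id generator standing in for a real insert call
        NodeIdCache visitors = new NodeIdCache(id -> nextNodeId[0]++);
        String[] sessionIds = {"abc", "def", "abc", "abc", "def"};
        for (String sessionId : sessionIds) {
            System.out.println(sessionId + " -> node " + visitors.getOrCreate(sessionId));
        }
        System.out.println("nodes created: " + visitors.createdCount());
    }
}
```

With ~100K nodes per import the map itself should be small, but I don't know
whether this is the recommended way to do it.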
Will I be able to just use an EmbeddedGraphDatabase, or should I have a
rethink? How is performance likely to be for these inserts?
Thanks
Tim