Hi,

I want to use Neo4j to analyse Apache logs. Each visitor is identified by a 
session ID, and each visit has an ID too: a new visit ID is assigned whenever 
20 minutes pass between page views. The graph consists of Visitor, Visit and 
Page nodes, where a Visitor links to multiple Visit nodes and a Visit links to 
multiple Page nodes. The Visit -> Page relationship has a property recording 
when in the visit the page was viewed (i.e. 1 = first page visited, 2 = second 
page in the visit, etc.).
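
To make the model concrete, here is a rough sketch in Python of how the visit 
numbers and the page-position property could be derived from raw hits. It is 
only an illustration: `assign_visits`, `VISIT_GAP` and the tuple layout are 
made-up names, and it assumes a new visit starts when the gap is strictly 
more than 20 minutes.

```python
from datetime import datetime, timedelta

# Assumption: a gap strictly longer than this starts a new visit.
VISIT_GAP = timedelta(minutes=20)


def assign_visits(hits):
    """Turn raw hits into Visit -> Page rows.

    hits: iterable of (session_id, timestamp, url) tuples, in any order.
    Returns a list of (session_id, visit_no, position, url) tuples, where
    visit_no counts that visitor's visits from 1 and position is the
    1-based order of the page within the visit.
    """
    rows = []
    last_seen = {}  # session_id -> timestamp of that visitor's previous hit
    state = {}      # session_id -> [current visit_no, current position]
    # Sort by visitor, then by time, so each visitor's hits are in order.
    for session_id, ts, url in sorted(hits, key=lambda h: (h[0], h[1])):
        prev = last_seen.get(session_id)
        if prev is None or ts - prev > VISIT_GAP:
            # First hit for this visitor, or the gap was too long: new visit.
            visit_no = state.get(session_id, [0, 0])[0] + 1
            state[session_id] = [visit_no, 0]
        state[session_id][1] += 1
        last_seen[session_id] = ts
        rows.append((session_id, state[session_id][0], state[session_id][1], url))
    return rows
```

Each returned row corresponds to one Visit -> Page relationship, with the 
third field being the value of the position property.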

How would I best go about importing data into this graph? I'd use the batch 
inserter, but before I create a new Visitor or Visit I need to check whether a 
node with the same ID already exists. I've read that it's better to use the 
EmbeddedGraphDatabase, but I'm going to be inserting ~100K nodes 3 times per 
day. When I used MySQL in a similar way in the past, performance was abysmal, 
so I couldn't take that approach.
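
The existence check I have in mind looks roughly like the sketch below. It is 
plain Python with a stand-in store, not Neo4j's API: `NodeCache`, `FakeStore` 
and `create_node` are hypothetical names, and it assumes all IDs seen during 
one import run fit in memory.

```python
class NodeCache:
    """In-memory get-or-create lookup keyed by (node type, external ID)."""

    def __init__(self, create_node):
        # create_node(label, key) inserts a node and returns its internal ID.
        self._create_node = create_node
        self._ids = {}

    def get_or_create(self, label, key):
        """Return (node_id, created) without inserting a duplicate."""
        cache_key = (label, key)
        node_id = self._ids.get(cache_key)
        created = node_id is None
        if created:
            node_id = self._create_node(label, key)
            self._ids[cache_key] = node_id
        return node_id, created


class FakeStore:
    """Stand-in for the real inserter: hands out sequential node IDs."""

    def __init__(self):
        self.nodes = []

    def create_node(self, label, key):
        self.nodes.append((label, key))
        return len(self.nodes) - 1


store = FakeStore()
cache = NodeCache(store.create_node)
visitor_id, created = cache.get_or_create("Visitor", "session-abc")
```

A cache like this only knows about nodes created in the current run, so nodes 
from earlier imports would still have to be looked up in the database itself 
or loaded into the cache first.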

Will I be able to just use an EmbeddedGraphDatabase, or should I have a 
rethink? How is performance likely to be for these inserts?

Thanks
Tim

_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
