Hi Suruchi, I'll answer each question inline below.
On Wed, Jun 30, 2010 at 6:27 PM, Suruchi Deodhar <[email protected]> wrote:

> Hello!
>
> I had a few questions regarding Batch insert and normal insert in neo4j:
>
> - All the properties of nodes need to be set initially while creating graph
> db in batch insert mode. Can the values of a subset of the nodes be
> updated/changed later on? Does this lead to any performance issues?
>

Yes, you can update the properties when running in normal operations mode (using the GraphDatabaseService API). You could also use the setNodeProperties(...) and setRelationshipProperties(...) methods in the batch inserter, although that is not what the batch inserter is optimized for.

>
> - While creating graph using EmbeddedGraphDatabase, I am reading from Oracle
> and creating in increments of around 20000. Nodes get created pretty fast
> (2 million nodes->10 minutes)
> But while creating relationships in increments of 20000, I get the following
> error:
>
> *org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog close*
>

I don't recognize this. Could you provide a stack trace? My guess would be that you shut down the GraphDatabase after importing the nodes, so that it is already closed when you start importing relationships.

>
> I am not shutting down the database intermediately in the code.
> Is there any other reason because of which this error may occur?
>
> - Is there a difference in performance while running queries on a database
> created using batch insert as opposed to one created using
> EmbeddedGraphDatabase? I somehow am seeing significant performance
> difference while running queries on my db created using batch insert in
> comparison to inserts using GraphDatabaseService. This may be because I am
> updating values of some nodes intermediately. Is anyone else facing similar
> issues?
> Do you have any suggestions?
>

If you create the graph using the EmbeddedGraphDatabase API and then start running queries right away, you will get better performance than if you create the graph using the BatchInserter API and then start up an EmbeddedGraphDatabase for queries. This is because the EmbeddedGraphDatabase puts your nodes and relationships in the cache as you create them, so large portions of your graph are already cached when you start querying. An EmbeddedGraphDatabase that starts up cold, by contrast, has to populate its cache as you perform your queries. The second time you run the queries (on the same GraphDatabase instance), both cases should report about the same execution times.

Cheers,
-- 
Tobias Ivarsson <[email protected]>
Hacker, Neo Technology
www.neotechnology.com
Cellphone: +46 706 534857
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
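
P.S. To make the warm-versus-cold cache point concrete, here is a toy model in plain Java. It is only a sketch and not Neo4j code: ToyStore, ToyDb, CacheWarmthDemo, and the diskReads counter are all made up for illustration, and the real cache is certainly more involved than a HashMap. It assumes nothing more than what is described above: creating through the database leaves entries in the cache, while a database opened on an already-filled store starts with an empty cache.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes are part of Neo4j.
class ToyStore {
    private final Map<Long, String> disk = new HashMap<>();
    int diskReads = 0; // counts simulated slow reads from the store

    void write(long id, String value) { disk.put(id, value); }

    String read(long id) {
        diskReads++;
        return disk.get(id);
    }
}

class ToyDb {
    private final ToyStore store;
    private final Map<Long, String> cache = new HashMap<>();

    ToyDb(ToyStore store) { this.store = store; }

    // Creating through the database also leaves the node in the cache.
    void createNode(long id, String value) {
        store.write(id, value);
        cache.put(id, value);
    }

    // Reads try the cache first and fall back to the store on a miss.
    String getNode(long id) {
        String cached = cache.get(id);
        if (cached != null) return cached;
        String loaded = store.read(id);
        if (loaded != null) cache.put(id, loaded);
        return loaded;
    }
}

class CacheWarmthDemo {
    // Runs the same "query" (read nodes 0..nodeCount-1) and returns
    // how many store reads it caused.
    static int queryAll(ToyDb db, ToyStore store, int nodeCount) {
        int before = store.diskReads;
        for (long id = 0; id < nodeCount; id++) db.getNode(id);
        return store.diskReads - before;
    }

    public static void main(String[] args) {
        int n = 1000;

        // Case 1: graph created through the database itself (warm cache).
        ToyStore warmStore = new ToyStore();
        ToyDb warmDb = new ToyDb(warmStore);
        for (long id = 0; id < n; id++) warmDb.createNode(id, "node-" + id);

        // Case 2: store filled directly (batch-insert style), then opened cold.
        ToyStore coldStore = new ToyStore();
        for (long id = 0; id < n; id++) coldStore.write(id, "node-" + id);
        ToyDb coldDb = new ToyDb(coldStore);

        System.out.println("warm, first run:  " + queryAll(warmDb, warmStore, n));
        System.out.println("cold, first run:  " + queryAll(coldDb, coldStore, n));
        System.out.println("cold, second run: " + queryAll(coldDb, coldStore, n));
    }
}
```

In this toy, the warm database answers the first run without touching the store, the cold one pays one store read per node on its first run, and the cold one's second run is served from the cache as well. That is the same pattern as above: the difference shows up on the first run and should largely disappear on the second.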

