OK, I disabled the auto-indexer and all the indices I was creating. Still no great gains. One thing I am doing is:
1. Some logic creates temporary vertices/edges in a transaction.
2. It calls another piece of logic, which proceeds based on the presence of those vertices.
3. Once step 2 finishes, the transaction from step 1 is rolled back.
4. As a result of step 2 completing, an asynchronous thread creates more vertices/edges and commits that transaction.

I suspect that creating the fake nodes in step 1 so that step 2 can proceed, and then rolling them back, is what is slowing things down.

On Sunday, December 7, 2014 12:23:01 PM UTC-5, Michael Hunger wrote:
>
> There are a lot of factors in play that affect performance:
>
> - virtualization and ceph
> - tinkerpop indirection
> - not sure about the batch-size of your updates
> - # of indexes, esp. if you have both schema indexes as well as
>   relationship-indexes (I guess you don't need most of them)
>
> -> my suggestions would be:
> - measure the virtualization impact; if it affects operations too much, move
>   closer to a real machine
> - remove the indexes you don't really need, premature indexing is not
>   useful, evaluate if you really need them to *find initial nodes*
>
> *after* you tried those two and if it doesn't get better please come back
> with your graph.db/messages.log ; data-model, data-size and queries
>
> Michael
>
> On Sun, Dec 7, 2014 at 5:52 PM, Chris Vest <[email protected]> wrote:
>
>> My guess would be that it’s the index updates that are taking time. It’s
>> usually the case for any database that supports secondary indexes, that
>> they trade write performance for read performance.
>>
>> --
>> Chris Vest
>> System Engineer, Neo Technology
>> [ skype: mr.chrisvest, twitter: chvest ]
>>
>>
>> On 07 Dec 2014, at 07:25, Amit Kumar <[email protected]> wrote:
>>
>> Hello Experts,
>>
>> Need guidance on a critical issue I am facing. Using tinkerpop blueprints
>> 2.5 with community neo4j embedded mode, I am seeing a gradual (very
>> noticeable) performance hit while inserting a bunch of vertices and edges
>> (< 50 vertices and 70 edges) in one iteration. The program is building
>> vertices/edges based on business logic.
>>
>> Have tried setting cache_type to none, and have indices on almost all
>> properties of vertices as well as edges with auto-indexer on. The first
>> load (on a clean database) takes < 1 second for < 100 vertices and < 120
>> edges. Subsequent idempotent loads are getting slower by almost 800
>> milliseconds (inconsistent). However, the time taken keeps increasing when
>> the database grows.
>>
>> NOTE: Program runs on a VM with data storage for the graph on CEPH. There
>> are NO fancy gremlin queries etc. while trying to determine if a
>> vertex/edge already exists before inserting.
>>
>> Need quick help. Thanks in advance.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
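
To check the suspicion about step 1, here is a minimal sketch of the four-step flow described at the top of this message, with each step timed separately. It uses only the JDK: `ToyGraph`, `runOnce` and the vertex ids are hypothetical stand-ins, not the Blueprints or Neo4j API, so the numbers it prints say nothing about Neo4j itself. The idea is only to show where the timing calls could go in the real code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class RollbackFlowSketch {

    /** Hypothetical stand-in for a transactional graph; NOT the Blueprints or Neo4j API. */
    static class ToyGraph {
        private final List<String> committed = new ArrayList<>();
        // One shared buffer of uncommitted vertices. A real database isolates
        // transactions from each other; this toy does not model that.
        private final List<String> pending = new ArrayList<>();

        synchronized void addVertex(String id) { pending.add(id); }

        synchronized boolean isVisible(String id) {
            return pending.contains(id) || committed.contains(id);
        }

        synchronized void commit() { committed.addAll(pending); pending.clear(); }

        synchronized void rollback() { pending.clear(); }

        synchronized int committedCount() { return committed.size(); }
    }

    /** Runs the four steps once and returns the elapsed nanoseconds of each step. */
    static long[] runOnce(ToyGraph graph) {
        long t0 = System.nanoTime();

        // Step 1: create temporary vertices inside an open transaction.
        graph.addVertex("tmp-1");
        graph.addVertex("tmp-2");
        long t1 = System.nanoTime();

        // Step 2: the dependent logic proceeds only if it sees those vertices.
        boolean proceed = graph.isVisible("tmp-1") && graph.isVisible("tmp-2");
        long t2 = System.nanoTime();

        // Step 3: roll back the temporary work from step 1.
        graph.rollback();
        long t3 = System.nanoTime();

        // Step 4: an asynchronous task creates the real vertex and commits.
        CompletableFuture<Void> async = CompletableFuture.runAsync(() -> {
            if (proceed) {
                graph.addVertex("real-1");
                graph.commit();
            }
        });
        async.join(); // wait here only so the step can be timed
        long t4 = System.nanoTime();

        return new long[] {t1 - t0, t2 - t1, t3 - t2, t4 - t3};
    }

    public static void main(String[] args) {
        ToyGraph graph = new ToyGraph();
        long[] nanos = runOnce(graph);
        String[] labels = {"1 create temp", "2 dependent logic", "3 rollback", "4 async commit"};
        for (int i = 0; i < nanos.length; i++) {
            System.out.println("step " + labels[i] + ": " + nanos[i] / 1000 + " us");
        }
    }
}
```

Logging the same four intervals on every iteration of the real load should show whether it is the create-and-rollback in steps 1 and 3, or the commit in step 4, that grows as the database grows.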
