I agree, Michael, there should be a better way of doing steps 1 and 3. The problem with the in-memory approach is that it needs to be loaded with all the existing (required) data before new vertices/edges can be added. It's something like this:
1. You have an existing graph and add a few vertices/edges so that the next steps can proceed (they require the presence of these vertices/edges).
2. Roll back the newly created vertices/edges after the logic is done.

To do step 1 in memory, I may need the complete graph in memory.

On Sunday, December 7, 2014 3:08:41 PM UTC-5, Michael Hunger wrote:
>
> On Sun, Dec 7, 2014 at 7:30 PM, Amit Kumar <[email protected]> wrote:
>
>> Ok, disabling the auto-indexer and all indices I am creating. Still no
>> great gains. One thing I am doing is -
>>
>
> Sounds complicated and unnecessary, what's the reason for that approach?
>
>> 1. A logic that creates temporary vertices/edges in a transaction
>> 2. calls another logic for it to proceed by seeing the presence of those
>> vertices
>> 3. Once call 2 finishes its logic, transaction in 1 is rolled back
>> 4. As a result of step 2 completion, an asynchronous thread attempts to
>> create more vertices/edges and commits this transaction.
>>
>> I suspect that the fake creation of nodes as part of step 1 for step 2 to
>> proceed and then rolling it back is the one which is trying to slow down
>> things....
>>
>
> Can't you do that in memory? I think moving decision making logic into the
> transactional system (which includes disk flushes on commit) is not the
> fastest way of guaranteeing.
>
>> On Sunday, December 7, 2014 12:23:01 PM UTC-5, Michael Hunger wrote:
>>>
>>> There are a lot of factors in play that affect performance:
>>>
>>> - virtualization and ceph
>>> - tinkerpop indirection
>>> - not sure about the batch-size of your updates
>>> - # of indexes, esp. if you have both schema indexes as well as
>>> relationship-indexes (I guess you don't need most of them)
>>>
>>> -> my suggestions would be:
>>> - measure the virtualization impact if it affects operations too much
>>> move closer to a real machine
>>> - remove the indexes you don't really need, premature indexing is not
>>> useful, evaluate if you really need them to *find initial nodes*
>>>
>>> *after* you tried those two and if it doesn't get better please come
>>> back with your graph.db/messages.log ; data-model, data-size and queries
>>>
>>> Michael
>>>
>>> On Sun, Dec 7, 2014 at 5:52 PM, Chris Vest <[email protected]>
>>> wrote:
>>>
>>>> My guess would be that it’s the index updates that are taking time.
>>>> It’s usually the case for any database that supports secondary indexes,
>>>> that they trade write performance for read performance.
>>>>
>>>> --
>>>> Chris Vest
>>>> System Engineer, Neo Technology
>>>> [ skype: mr.chrisvest, twitter: chvest ]
>>>>
>>>>
>>>> On 07 Dec 2014, at 07:25, Amit Kumar <[email protected]> wrote:
>>>>
>>>> Hello Experts,
>>>>
>>>> Need guidance on a critical issue I am facing. Using tinkerpop
>>>> blueprints 2.5 with community neo4j embedded mode, I am seeing gradual
>>>> (very noticeable) performance hit while inserting a bunch of vertices and
>>>> edges (< 50 vertices and 70 edges) in one iteration. The program is
>>>> building vertices/edges based on business logic.
>>>>
>>>> Have tried setting cache_type to none, and have indices on almost all
>>>> properties of vertices as well as edges with auto-indexer on. The first
>>>> load (on a clean database) takes < 1 second for < 100 vertices and < 120
>>>> edges. Subsequent idempotent loads are getting slower by almost 800 milli
>>>> seconds (inconsistent). However, the time taken keeps increasing when the
>>>> database grows.
>>>>
>>>> NOTE: Program runs on a VM with data storage for the graph on CEPH.
>>>> There is NO fancy gremlin queries etc while trying to determine if a
>>>> vertex/edge already exists before inserting.
>>>>
>>>> Need quick help. Thanks in advance.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Neo4j" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
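
On the in-memory suggestion discussed in this thread: one way it might work without loading the complete graph is to keep only the temporary vertices/edges in memory and fall through to the persistent graph for everything else. Below is a minimal sketch of that idea in plain Python, not Neo4j or Blueprints code. The `OverlayGraph` class and the two lookup callables are hypothetical names made up for illustration; in a real setup the callables would wrap whatever existence check the program already runs against the database.

```python
class OverlayGraph:
    """Scratch layer of temporary vertices/edges on top of a persistent graph.

    Hypothetical illustration only: the backing lookups are plain callables,
    standing in for whatever existence check the application already uses.
    """

    def __init__(self, backing_has_vertex, backing_has_edge):
        self._backing_has_vertex = backing_has_vertex
        self._backing_has_edge = backing_has_edge
        self._temp_vertices = set()
        self._temp_edges = set()

    def add_temp_vertex(self, vertex_id):
        # Recorded in memory only; nothing is written to the database.
        self._temp_vertices.add(vertex_id)

    def add_temp_edge(self, src, label, dst):
        self._temp_edges.add((src, label, dst))

    def has_vertex(self, vertex_id):
        # Check the scratch layer first, then ask the persistent graph.
        return (vertex_id in self._temp_vertices
                or self._backing_has_vertex(vertex_id))

    def has_edge(self, src, label, dst):
        return ((src, label, dst) in self._temp_edges
                or self._backing_has_edge(src, label, dst))

    def discard(self):
        # Plays the role of the rollback: drop the temporary elements.
        self._temp_vertices.clear()
        self._temp_edges.clear()


# Stand-ins for the persistent graph, purely for demonstration.
persisted_vertices = {"a", "b"}
persisted_edges = {("a", "knows", "b")}

overlay = OverlayGraph(
    lambda v: v in persisted_vertices,
    lambda s, l, d: (s, l, d) in persisted_edges,
)
overlay.add_temp_vertex("tmp1")
overlay.add_temp_edge("a", "needs", "tmp1")
```

The logic in step 2 would then query the overlay instead of the database, and `discard()` replaces the transaction rollback. This only helps if that logic can be pointed at such a wrapper, which is an assumption about the application.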
