Re: [Neo4j] Lucene index recovery
You could configure the lucene data source to auto rotate logs more frequently. If logs are large, recovery takes longer. -Johan On Mon, Oct 17, 2011 at 11:37 PM, Nuo Yan yan@gmail.com wrote: What if, in production, the neo4j server dies for whatever reason and people have to start up a new server with the current snapshot of data (which would be data from a non-clean shutdown)? In such a case, I don't think it's acceptable to take lots of time (hours for large indices) to bring the server back. Is there a best practice here? Thanks, Nuo On Thu, Sep 1, 2011 at 8:19 AM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Dima, are you shutting down your database correctly? Make sure you call database.shutdown() and wait for it to finish ... Cheers, /peter neubauer GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://startupbootcamp.org/ - Öresund - Innovation happens HERE. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Thu, Sep 1, 2011 at 1:31 PM, Dima Gutzeit dima.gutz...@mailvision.com wrote: Dear list members, Each time I restart my server based on Neo4J I can see this in the logs: Sep 1, 2011 7:23:17 PM org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog doInternalRecovery INFO: Non clean shutdown detected on log [/opt/data/nioneo_logical.log.2]. Recovery started ... Sep 1, 2011 7:23:18 PM org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog doInternalRecovery INFO: Non clean shutdown detected on log [/opt/data/index/lucene.log.1]. Recovery started ... Operation which takes time ... lots of time. What is the correct way of preventing that when restarting? Thanks in advance. Regards, Dima Gutzeit. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
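A minimal sketch of the clean-shutdown pattern Peter describes: register a JVM shutdown hook so that database.shutdown() always runs and the next startup does not have to replay the logical logs. The store path is an example, not taken from the thread.

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    public class CleanShutdown
    {
        public static void main( String[] args )
        {
            // example path; point this at your own store directory
            final GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "data/graph.db" );
            // make sure shutdown() runs even when the process is terminated,
            // otherwise the logical logs are replayed (recovery) on the next start
            Runtime.getRuntime().addShutdownHook( new Thread()
            {
                @Override
                public void run()
                {
                    graphDb.shutdown();
                }
            } );
            // ... use graphDb ...
        }
    }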
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
On Thu, Sep 22, 2011 at 2:15 PM, st3ven st3...@web.de wrote: Hi Johan, I changed the settings as you described, but that didn't really change the speed significantly. The previous configuration would make the machine use swap and that will kill performance. To store the degree as a property on each node is an option, but I want the node degree to be calculated from the graph database as I also want to check The problem is that you are trying to access an 85GB+ dataset using only 16GB RAM. The recommendation then is to aggregate the information (store the degree count as a property). Peter also mentioned using HA (cache sharding) but if you can just get some more RAM into the machine you will see an improvement. An SSD would also help here since you are touching all edges in the graph, while a mechanical disk (in this setup) will have horrible performance (low throughput with 99% load on disk). There are SSD solutions that handle terabytes of data today and they are dropping in price. Regards, Johan ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
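A sketch of the aggregation Johan recommends: count the degree while the data is injected and write it as a property, instead of traversing every relationship afterwards. The class and package names follow the 1.x batch inserter API, and the in-memory count map is an assumption that only works while the node ids fit in RAM.

    import java.util.HashMap;
    import java.util.Map;
    import org.neo4j.graphdb.DynamicRelationshipType;
    import org.neo4j.graphdb.RelationshipType;
    import org.neo4j.helpers.collection.MapUtil;
    import org.neo4j.kernel.impl.batchinsert.BatchInserter;
    import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

    public class DegreeAtInsertTime
    {
        private static final RelationshipType KNOWS =
                DynamicRelationshipType.withName( "KNOWS" );

        public static void main( String[] args )
        {
            BatchInserter inserter = new BatchInserterImpl( "target/batch-db" );
            Map<Long, Integer> degree = new HashMap<Long, Integer>();
            // create nodes and relationships, bumping the counters as we go
            long a = inserter.createNode( MapUtil.map() );
            long b = inserter.createNode( MapUtil.map() );
            inserter.createRelationship( a, b, KNOWS, null );
            count( degree, a );
            count( degree, b );
            // once all relationships are in, persist the aggregated count
            // (setNodeProperties replaces the node's existing properties)
            for ( Map.Entry<Long, Integer> entry : degree.entrySet() )
            {
                inserter.setNodeProperties( entry.getKey(),
                        MapUtil.map( "degree", entry.getValue() ) );
            }
            inserter.shutdown();
        }

        private static void count( Map<Long, Integer> degree, long node )
        {
            Integer current = degree.get( node );
            degree.put( node, current == null ? 1 : current + 1 );
        }
    }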
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Hi Stephan, You could try lowering the heap size to -Xmx2G and cache_type=weak with 10G memory mapped for relationships. The machine only has 16G RAM and will not be able to process such a large dataset at in-memory speeds. Another option is to calculate degree at insertion time and store it as a property on each node. Regards, Johan On Wed, Sep 21, 2011 at 12:44 PM, st3ven st3...@web.de wrote: Hi Linan, I just tried it with the outgoing relationships, but unfortunately that didn't speed things up. The size of my db is around 140GB and so it is not possible for me to dump the full directory into a ramfs. My files on the hard disk have the following size: neostore.nodestore.db = 31MB neostore.relationshipstore.db = 85GB neostore.propertystore.db = 65GB neostore.propertystore.db.strings = 180MB Is there maybe a chance of reducing the size of my database? Cheers, Stephan -- View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Creating-a-graph-database-with-BatchInserter-and-getting-the-node-degree-of-every-node-tp3351599p3355074.html Sent from the Neo4j Community Discussions mailing list archive at Nabble.com. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
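For reference, a sketch of the configuration Johan suggests (2G heap, weak reference cache, 10G memory mapped for the relationship store). The setting keys are the standard 1.x embedded configuration names; the store path is an example.

    import java.util.Map;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.helpers.collection.MapUtil;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    public class WeakCacheConfig
    {
        public static void main( String[] args )
        {
            // run the JVM with -Xmx2g as suggested above
            Map<String, String> config = MapUtil.stringMap(
                    "cache_type", "weak",
                    "neostore.relationshipstore.db.mapped_memory", "10G" );
            GraphDatabaseService graphDb =
                    new EmbeddedGraphDatabase( "target/graph.db", config );
            // ... calculate the node degrees ...
            graphDb.shutdown();
        }
    }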
Re: [Neo4j] TransactionFailureException
Hi, Did you get any error that caused the shutdown hook to run? Is there a tm_tx_log.2 that contains data? Regards, Johan On Fri, Sep 9, 2011 at 2:42 PM, skarab77 skara...@o2.pl wrote: Hi, I have the following problem: when my program crashes in the middle of a transaction, I am not able to start my neo4j embedded store again. I get the following exception. Exception in thread "main" org.neo4j.graphdb.TransactionFailureException: Unable to start TM, no active tx log file found but found either tm_tx_log.1 or tm_tx_log.2 file, please set one of them as active or remove them. at org.neo4j.kernel.impl.transaction.TxManager.init(TxManager.java:175) at org.neo4j.kernel.impl.transaction.TxModule.start(TxModule.java:96) at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:161) at org.neo4j.kernel.EmbeddedGraphDbImpl.init(EmbeddedGraphDbImpl.java:190) at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:80) at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:64) My settings are: Neo4j 1.4.1 (1.5M1) Community Version, Windows 7 64bit, JDK 1.6.27 32bit. The file tm_tx_log.1 is empty. My code follows the neo4j guideline (http://wiki.neo4j.org/content/Transactions) and I have also registered a shutdown hook to close Neo4j in case of an error. Best Regards, Wojtek ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Size on disk, and number of properties
Removing the log files ending with .v<version number> at runtime is perfectly safe to do but will turn off the ability to do incremental backups. You can however still perform live full backups. With keep_logical_logs=false configured, the logs will automatically be deleted upon rotation. -Johan On Sat, Sep 3, 2011 at 1:49 AM, Aseem Kishore aseem.kish...@gmail.com wrote: Thanks for the insights Johan! Regarding the existing disk space then, by far the bulk of it is from the logs. Is there a way to prune or garbage collect them? Is simply deleting the files safe? Should the db be off if I do that? Etc. Thanks much! Aseem On Tue, Aug 30, 2011 at 2:47 AM, Johan Svensson jo...@neotechnology.com wrote: Hi Aseem, This is actually expected behavior when performing a file copy of a running db and starting up with default configuration. If you remove the files ending with .id in the db directory on the local snapshot and start up setting rebuild_idgenerators_fast=false you should see the accurate amount of nodes, relationships and properties. Regarding the amount of properties not matching, this could be due to a non clean shutdown on the production system. We are planning on improving this in the near future by allowing for more aggressive reuse of ids for properties. This will specifically improve things for workloads that perform a lot of property updates. -Johan On Tue, Aug 30, 2011 at 10:05 AM, Aseem Kishore aseem.kish...@gmail.com wrote: Hey guys, We do offline backups of our db on a semi-regular basis (every few days), where we (1) stop the running db, (2) copy its data directory and (3) restart the db. A few times early on, we did running backups -- but not the proper online way -- where we simply copied the data directory while the db was still running. (We did this during times where we were confident no requests were hitting the db.) We noticed that every time we did the running backup, the number of properties the web admin reported -- and the space on disk of the db -- would jump up quite a bit. We stopped doing that recently. But even now, both these numbers have gotten quite a bit higher than we expect them to, and strangely, they seem to differ highly between the running db and the copies. What could be causing all of this? Here are our current numbers: *Production* - 2,338 nodes - 4,473 rels - 114,231 props (higher than we would expect it to be, but not by an order of magnitude) - *1.39 GB!* -- this is way unexpected, particularly since our db used to be in the ~10 KB ballpark, and we certainly haven't experienced hockey stick growth yet ;) The logical log only takes up 57 KB (0%) btw. *Local snapshot* - 2,338 nodes - 4,473 rels - *2,607,892 props!!!* -- ??? - *1.37 GB!* -- equally surprisingly high, but also interesting that it's less than the production db's size. 0 KB logical logs. I looked around the wiki and searched this mailing list but didn't find many clues here. 
But as requested on another thread, here's the output of `ls -lh data/graph.db/`: total 1474520 -rw-r--r-- 1 aseemk staff 11B Aug 30 00:46 active_tx_log drwxr-xr-x 52 aseemk staff 1.7K Aug 30 00:46 index/ -rw-r--r-- 1 aseemk staff 343B Aug 30 00:46 index.db -rw-r--r-- 1 aseemk staff 854K Aug 30 00:46 messages.log -rw-r--r-- 1 aseemk staff 36B Aug 30 00:46 neostore -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.id -rw-r--r-- 1 aseemk staff 26K Aug 30 00:46 neostore.nodestore.db -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.nodestore.db.id -rw-r--r-- 1 aseemk staff 62M Aug 30 00:46 neostore.propertystore.db -rw-r--r-- 1 aseemk staff 133B Aug 30 00:46 neostore.propertystore.db.arrays -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.arrays.id -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.id -rw-r--r-- 1 aseemk staff 1.0K Aug 30 00:46 neostore.propertystore.db.index -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.index.id -rw-r--r-- 1 aseemk staff 4.0K Aug 30 00:46 neostore.propertystore.db.index.keys -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.index.keys.id -rw-r--r-- 1 aseemk staff 69M Aug 30 00:46 neostore.propertystore.db.strings -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.strings.id -rw-r--r-- 1 aseemk staff 144K Aug 30 00:46 neostore.relationshipstore.db -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.relationshipstore.db.id -rw-r--r-- 1 aseemk staff 55B Aug 30 00:46 neostore.relationshiptypestore.db -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.relationshiptypestore.db.id -rw-r--r-- 1 aseemk staff 602B Aug 30 00:46 neostore.relationshiptypestore.db.names -rw-r--r
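A short sketch of the keep_logical_logs setting Johan mentions at the top of this message, passed as embedded configuration (the setting name comes from this thread; for the standalone server the same key would go into its configuration file; the path is an example):

    Map<String, String> config = MapUtil.stringMap( "keep_logical_logs", "false" );
    GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "data/graph.db", config );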
Re: [Neo4j] Size on disk, and number of properties
Hi Aseem, This is actually expected behavior when performing file copy of running db and starting up with default configuration. If you remove the files ending with .id in the db directory on the local snapshot and start up setting rebuild_idgenerators_fast=false you should see the accurate amount of nodes, relationships and properties. Regarding the amount of properties not matching this could be due to a non clean shutdown on the production system. We are planing on improving this in the near future by allowing for more aggressive reuse of ids for properties. This will specifically improve things for workloads that perform a lot of property updates. -Johan On Tue, Aug 30, 2011 at 10:05 AM, Aseem Kishore aseem.kish...@gmail.com wrote: Hey guys, We do offline backups of our db on a semi-regular basis (every few days), where we (1) stop the running db, (2) copy its data directory and (3) restart the db. A few times early on, we did running backups -- but not the proper online way -- where we simply copied the data directory while the db was still running. (We did this during times where we were confident no requests were hitting the db.) We noticed that every time we did the running backup, the number of properties the web admin reported -- and the space on disk of the db -- would jump up quite a bit. We stopped doing that recently. But even now, both these numbers have gotten quite a bit higher than we expect to, and strangely, they seem to differ highly between the running db and the copies. What could be causing all of this? Here are our current numbers: *Production* - 2,338 nodes - 4,473 rels - 114,231 props (higher than we would expect it to be, but not by an order of magnitude) - *1.39 GB!* -- this is way unexpected, particularly since our db used to be in the ~10 KB ballpark, and we certainly haven't experienced hockey stick growth yet ;) The logical log only takes up 57 KB (0%) btw. *Local snapshot* - 2,338 nodes - 4,473 rels - *2,607,892 props!!!* -- ??? - *1.37 GB!* -- equally surprisingly high, but also interesting that it's less than the production db's size. 0 KB logical logs. I looked around the wiki and searched this mailing list but didn't find much clues here. 
But as requested on another thread, here's the output of `ls -lh data/graph.db/`: total 1474520 -rw-r--r-- 1 aseemk staff 11B Aug 30 00:46 active_tx_log drwxr-xr-x 52 aseemk staff 1.7K Aug 30 00:46 index/ -rw-r--r-- 1 aseemk staff 343B Aug 30 00:46 index.db -rw-r--r-- 1 aseemk staff 854K Aug 30 00:46 messages.log -rw-r--r-- 1 aseemk staff 36B Aug 30 00:46 neostore -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.id -rw-r--r-- 1 aseemk staff 26K Aug 30 00:46 neostore.nodestore.db -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.nodestore.db.id -rw-r--r-- 1 aseemk staff 62M Aug 30 00:46 neostore.propertystore.db -rw-r--r-- 1 aseemk staff 133B Aug 30 00:46 neostore.propertystore.db.arrays -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.arrays.id -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.id -rw-r--r-- 1 aseemk staff 1.0K Aug 30 00:46 neostore.propertystore.db.index -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.index.id -rw-r--r-- 1 aseemk staff 4.0K Aug 30 00:46 neostore.propertystore.db.index.keys -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.index.keys.id -rw-r--r-- 1 aseemk staff 69M Aug 30 00:46 neostore.propertystore.db.strings -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.propertystore.db.strings.id -rw-r--r-- 1 aseemk staff 144K Aug 30 00:46 neostore.relationshipstore.db -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.relationshipstore.db.id -rw-r--r-- 1 aseemk staff 55B Aug 30 00:46 neostore.relationshiptypestore.db -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.relationshiptypestore.db.id -rw-r--r-- 1 aseemk staff 602B Aug 30 00:46 neostore.relationshiptypestore.db.names -rw-r--r-- 1 aseemk staff 9B Aug 30 00:46 neostore.relationshiptypestore.db.names.id -rw-r--r-- 1 aseemk staff 16B Aug 30 00:46 nioneo_logical.log.1 -rw-r--r-- 1 aseemk staff 4B Aug 30 00:46 nioneo_logical.log.active -rw-r--r-- 1 aseemk staff 945K Aug 30 00:46 nioneo_logical.log.v0 -rw-r--r-- 1 aseemk staff 16B Aug 30 00:46 nioneo_logical.log.v1 -rw-r--r-- 1 aseemk staff 33K Aug 30 00:46 nioneo_logical.log.v10 -rw-r--r-- 1 aseemk staff 11K Aug 30 00:46 nioneo_logical.log.v11 -rw-r--r-- 1 aseemk staff 32K Aug 30 00:46 nioneo_logical.log.v12 -rw-r--r-- 1 aseemk staff 16B Aug 30 00:46 nioneo_logical.log.v13 -rw-r--r-- 1 aseemk staff 12M Aug 30 00:46 nioneo_logical.log.v14 -rw-r--r-- 1 aseemk staff 1.4M Aug 30 00:46 nioneo_logical.log.v15 -rw-r--r-- 1 aseemk
Re: [Neo4j] BatchInserter with Lucene Index
Hi Dario, Could you post the error message and stacktrace? Did the error happen after the initial import while still running in batch inserter mode, or in normal server/embedded transactional mode? Regards, Johan On Wed, Jun 29, 2011 at 4:30 PM, Dario Rexin dario.re...@xing.com wrote: Hi all, Recently I tried to import a huge dataset into neo using the BatchInserter. I also used the BatchInserterIndex. The import itself went very well and I was able to read from the graph, but when I tried to insert something I always got an error saying that no new transaction could be created. Here’s how I used the index: // provider is a LuceneBatchInserterIndexProvider BatchInserterIndex urnIndex = provider.nodeIndex("urn", MapUtil.stringMap( "type", "exact" )); for (Map.Entry<String, Long> entry : nodes.entrySet()) { Map<String, Object> props = MapUtil.map("urn", entry.getKey()); urnIndex.add(entry.getValue(), props); } I also called shutdown() on the provider and the BatchInserter afterwards. Is there anything I am missing? Cheers, Dario ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
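For comparison, a sketch of the full batch-insert-with-index lifecycle, flushing the index and shutting the provider down before the inserter. Index name and configuration follow Dario's snippet above; the store path, data and package names are assumptions for illustration.

    import java.util.HashMap;
    import java.util.Map;
    import org.neo4j.graphdb.index.BatchInserterIndex;
    import org.neo4j.graphdb.index.BatchInserterIndexProvider;
    import org.neo4j.helpers.collection.MapUtil;
    import org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider;
    import org.neo4j.kernel.impl.batchinsert.BatchInserter;
    import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

    public class BatchIndexImport
    {
        public static void main( String[] args )
        {
            BatchInserter inserter = new BatchInserterImpl( "target/batch-db" );
            BatchInserterIndexProvider provider =
                    new LuceneBatchInserterIndexProvider( inserter );
            BatchInserterIndex urnIndex = provider.nodeIndex( "urn",
                    MapUtil.stringMap( "type", "exact" ) );

            Map<String, Long> nodes = new HashMap<String, Long>();
            nodes.put( "urn:example:1", inserter.createNode( MapUtil.map() ) );
            for ( Map.Entry<String, Long> entry : nodes.entrySet() )
            {
                urnIndex.add( entry.getValue(), MapUtil.map( "urn", entry.getKey() ) );
            }
            urnIndex.flush(); // make the additions visible to index queries
            // shut down the index provider before the inserter
            provider.shutdown();
            inserter.shutdown();
        }
    }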
Re: [Neo4j] NonWritableChannelException
Paul, This could be related to the wrapper bug that was found, if you are running the server. If the server was under heavy load and entered GC thrashing (the JVM stopping all threads and just running GC), the wrapper thought the server was unresponsive and restarted it. This problem will be fixed in the 1.4.M05 release. Regards, Johan On Tue, Jun 21, 2011 at 1:22 PM, Paul Bandler pband...@cseuk.co.uk wrote: The above exception is thrown from the call stack indicated below while traversing a neo4j graph using the EmbeddedReadOnly database. Using 1.4M04. The application is running with 1gb of heap, with all other parameters defaulted except cache_type=weak, on Windows. I found some reports of this exception being thrown at shutdown back in January, but this is not happening at shutdown and I could find no posted resolution of that thread anyway. Can anyone suggest what the cause of this exception is? Thanks Paul Exception in thread "main" java.nio.channels.NonWritableChannelException at sun.nio.ch.FileChannelImpl.write(Unknown Source) at org.neo4j.kernel.impl.nioneo.store.AbstractPersistenceWindow.writeOut(AbstractPersistenceWindow.java:104) at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.refreshBricks(PersistenceWindowPool.java:536) at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:128) at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:526) at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getChainRecord(RelationshipStore.java:327) at org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.getMoreRelationships(ReadTransaction.java:114) at org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.getMoreRelationships(ReadTransaction.java:97) at org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:108) at org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:603) at org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:399) at org.neo4j.kernel.impl.core.IntArrayIterator.hasNext(IntArrayIterator.java:93) at org.neo4j.kernel.impl.core.NodeImpl.getSingleRelationship(NodeImpl.java:218) ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Neo4j proof of efficiency
Hi, This may be of interest http://arxiv.org/abs/1004.1001 (The Graph Traversal Pattern) and http://markorodriguez.com/2011/02/18/mysql-vs-neo4j-on-a-large-scale-graph-traversal/ Regards, Johan 2011/6/27 Ian Bussières ian.bussieres.mailingli...@gmail.com: Hello, I am using neo4j in a school project. I was wondering if anyone could point me to a scientific paper or proof of concept - something with actual data - that would be useful to build a document that would prove graph databases to be more suited (performance, scalability and efficiency) to a social network-like application. I'm interested in any benchmarks or design keys of any sort, books, whatever. I've searched a lot but have failed to find enough relevant sources of information. Thanks ahead of time, Ian. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment
Great. Will merge that patch into trunk as soon as possible. -Johan On Thu, Jun 16, 2011 at 10:21 PM, Jennifer Hickey jhic...@vmware.com wrote: Hi Johan, Sorry for the delay. I was finally able to try out that patch (against 1.3) on our test environment, and things are running smoothly. I have not seen the ClosedChannelException (or any others) once in 24 hours. Previously on the same system I saw it frequently, as early as 15 minutes into the uptime. Thanks! Jennifer From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson [jo...@neotechnology.com] Sent: Thursday, May 26, 2011 3:09 AM To: Neo4j user discussions Subject: Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment Hi Jennifier, Could you apply this patch to the kernel and then see if the problem still exists? If you want I can send you a jar but then I need to know what version of Neo4j you are using. Regards, Johan On Mon, May 23, 2011 at 6:50 PM, Jennifer Hickey jhic...@vmware.com wrote: Hi Tobias, Looks like the environment is still setup, so I should be able to attempt a repro with a patched version. Let me know what you would like me to use. Thanks, Jennifer From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf Of Tobias Ivarsson [tobias.ivars...@neotechnology.com] Sent: Monday, May 16, 2011 11:01 PM To: Neo4j user discussions Subject: Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment Hi Jennifer, Could you reproduce it on your side by doing the same kind of systems tests again? If you could then I'd be very happy if you could try a patched version that we have been working on and see if that fixes the issue. Cheers, Tobias On Tue, May 17, 2011 at 2:49 AM, Jennifer Hickey jhic...@vmware.com wrote: Hi Tobias, Unfortunately I don't have an isolated test case, as I was doing a fairly involved system test at the time. I may be able to have a colleague work on reproducing it at a later date (I've been diverted to something else for the moment). I was remote debugging with Eclipse, so I toggled a method breakpoint on Thread.interrupt() and then inspected the stack once the breakpoint was hit. Sorry I don't have more information at the moment. I agree that eliminating the interrupts sounds like the best approach, if possible. Thanks, Jennifer From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf Of Tobias Ivarsson [tobias.ivars...@neotechnology.com] Sent: Thursday, April 28, 2011 6:23 AM To: Neo4j user discussions Subject: Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment Hi Jennifer, I'd first like to thank you for the testing and analysis you've done. Very useful stuff. Do you think you could send some test code our way that reproduces this issue? This is actually the first time this issue has been reported, so I wouldn't say it is a common issue. My guess is that your thread volume triggered a rare condition that wouldn't be encountered otherwise. I'm also curious to know how you found the source of the interruptions. When I debug thread interruptions I've never been able to find out where the thread got interrupted from without doing tedious procedures of breakpoint + logging + trying to match thread ids. If you have a better method for doing that I'd very much like to know. I think we should focus the effort on fixing the interruption issue if we can. And I believe we would be able to do that if the interruptions do in fact originate from where you say they do. 
But the suggestion of being able to switch the lucene directory implementation is still interesting, but as you point out since it has issues on some platforms it would be better if we could be rid of the interruption issue. Cheers, Tobias On Thu, Apr 28, 2011 at 12:41 AM, Jennifer Hickey jhic...@vmware.com wrote: Hello, I've been running some tests w/approx 400 threads reading various indexed property values. I'm running on 64 bit Linux. I was frequently seeing the ClosedChannelException below. The javadoc on Lucene's NIOFSDirectory states that Accessing this class either directly or indirectly from a thread while it's interrupted can close the underlying file descriptor immediately if at the same time the thread is blocked on IO. The file descriptor will remain closed and subsequent access to {@link NIOFSDirectory} will throw a {@link ClosedChannelException}. If your application uses either {@link Thread#interrupt()} or {@link Future#cancel(boolean)} you should use {@link SimpleFSDirectory} in favor of {@link NIOFSDirectory}. A bit of debugging revealed that the Thread.interrupts were coming from Neo4j, specifically in RWLock and MappedPersistenceWindow. So it seems like
Re: [Neo4j] Parallelism
Hi, That is possible (and even recommended). The Java API is thread safe (with the exception of the batch inserter) both for reads and writes. Each thread may use its own transaction but it is not required to have a transaction when performing read operations (only for writes). Reading is lock free and will always read the last committed value. A multi core CPU is required to let the threads execute in parallel, with the advantage of scaling reads with the available number of cores. It is not possible to have a transaction associated with more than one thread at a time (but you can suspend a transaction and resume it in another thread if needed). Regards, Johan On Fri, Jun 17, 2011 at 2:23 PM, Norbert Tausch w...@ntausch.de wrote: Hi, is it possible to traverse a Neo4J graph DB using the Java API in parallel threads? Is this possible within one transaction or does every thread have to use its own transaction? Is the API thread-safe concerning read-only access? Is there any advantage concerning parallelism when using Neo4j as an embedded DB? Best regards ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
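A sketch of what Johan describes: several threads reading without any transaction while writes each run in their own transaction. Thread count, store path and property names are made up for illustration.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    public class ParallelAccess
    {
        public static void main( String[] args ) throws Exception
        {
            final GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "target/graph.db" );
            final long nodeId = createNode( graphDb );
            ExecutorService pool = Executors.newFixedThreadPool( 4 );
            for ( int i = 0; i < 4; i++ )
            {
                pool.submit( new Runnable()
                {
                    public void run()
                    {
                        // reads need no transaction and see the last committed value
                        Node node = graphDb.getNodeById( nodeId );
                        System.out.println( node.getProperty( "name", "unknown" ) );
                    }
                } );
            }
            pool.shutdown();
            pool.awaitTermination( 10, TimeUnit.SECONDS );
            graphDb.shutdown();
        }

        private static long createNode( GraphDatabaseService graphDb )
        {
            // each writing thread uses its own transaction
            Transaction tx = graphDb.beginTx();
            try
            {
                Node node = graphDb.createNode();
                node.setProperty( "name", "example" );
                tx.success();
                return node.getId();
            }
            finally
            {
                tx.finish();
            }
        }
    }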
Re: [Neo4j] Unexpected error
Hi, Looks like there was an OOME during commit, but the commit partially succeeded (removing the xid branch id association for the xa resource), causing the subsequent rollback call to fail. To guarantee consistency the kernel will block all mutating operations after this and a restart + recovery has to be performed. To avoid this make sure you don't get OOME thrown. What cache configuration was this on? If running on strong reference cache (default is soft) that may be the cause. Another possible cause is that the transaction is too large, so either increase the heap or split large transactions into smaller ones. Regards, Johan On Thu, Jun 9, 2011 at 10:50 AM, Massimo Lusetti mluse...@gmail.com wrote: Hi All, I'm going to give my apps another try on neo4j with the current 1.4.M03 implementation. After a while I got this stack trace for which I hope someone could give me a clue: org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:104) at my.services.graphdb.Neo4jSourceImpl.addNodes(Neo4jSourceImpl.java:734) at $Neo4jSource_1306fa8fc0b.addNodes($Neo4jSource_1306fa8fc0b.java) at my.services.input.RowLineProcessorImpl.processLogLines(RowLineProcessorImpl.java:86) at $RowLineProcessor_1306fa8fc11.processLogLines($RowLineProcessor_1306fa8fc11.java) at $RowLineProcessor_1306fa8fc0f.processLogLines($RowLineProcessor_1306fa8fc0f.java) at my.services.input.PickUpPollerImpl$DirPoller.run(PickUpPollerImpl.java:168) at java.util.TimerThread.mainLoop(Timer.java:534) at java.util.TimerThread.run(Timer.java:484) Caused by: javax.transaction.HeuristicMixedException: Unable to rollback --- error in commit: java.lang.OutOfMemoryError: Java heap space --- error code for rollback: 0 at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:669) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:588) at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:107) at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:85) ... 8 more Caused by: javax.transaction.xa.XAException: Unknown xid[GlobalId[NEOKERNL|9147195978689142839|809], BranchId[ 52 49 52 49 52 49 ]] at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.rollback(XaResourceManager.java:470) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.rollback(XaResourceHelpImpl.java:111) at org.neo4j.kernel.impl.transaction.TransactionImpl.doRollback(TransactionImpl.java:533) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:651) ... 11 more These are the command line options used to start the JVM: -Djava.awt.headless=true -XX:MaxPermSize=512m -Xms512m -Xmx2048m The box has 8G of RAM. At the time of the exception the db had 4482380 nodes and 94613402 relationships; a lot of my relationships go to a single node. The operations as usual are simple inserts into the DB with some checks on Index (RelationalIndex and plain Index). Any help is really appreciated Cheers -- Massimo http://meridio.blogspot.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
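A sketch of the "split large transactions into smaller ones" advice: commit every N operations so that no single transaction has to keep the whole insert in the heap. The batch size, store path and data are illustrative.

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    public class BatchedCommits
    {
        private static final int BATCH_SIZE = 10000;

        public static void main( String[] args )
        {
            GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "target/graph.db" );
            Transaction tx = graphDb.beginTx();
            try
            {
                for ( int i = 0; i < 1000000; i++ )
                {
                    Node node = graphDb.createNode();
                    node.setProperty( "line", "log entry " + i );
                    if ( ( i + 1 ) % BATCH_SIZE == 0 )
                    {
                        // commit what we have so far and start a fresh transaction
                        tx.success();
                        tx.finish();
                        tx = graphDb.beginTx();
                    }
                }
                tx.success();
            }
            finally
            {
                tx.finish();
            }
            graphDb.shutdown();
        }
    }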
Re: [Neo4j] Interesting Neo4J design question...unidirectional relationships
You could modify the structure of how the collection is stored so there are several chains that can be updated in parallel for each collection. Kind of how ConcurrentHashMap works with several locks. -Johan On Fri, Jun 10, 2011 at 12:16 AM, Rick Bullotta rick.bullo...@thingworx.com wrote: We seem to be encountering a lot of issues when attempting to do lots of reads/writes/deletes of nodes in a collection scenario, where the members of the collection (each a node w/properties) are linked to their collection (also a node) via a relationship. This creates a hot spot and concurrency issue apparently, which has led to some unpredictable performance. In this specific use case, the relationship is only meaningful in one direction, so I am considering creating a property on the members of type long, which corresponds to the node id of the collection node. This would seem to work, and would likely avoid the issues we're encountering, but it makes me feel a bit dirty to do so in a graph database. Any other suggestions? Any other workarounds for the issues with frequent updates to a node and its relationships? Many thanks, Rick ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
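A sketch of the striping Johan suggests: give each collection several bucket nodes and attach members to a bucket chosen by hashing, so concurrent inserts mostly lock different chains. Relationship types, the bucket count and the lookup-by-iteration are assumptions for illustration, not an existing API.

    import org.neo4j.graphdb.DynamicRelationshipType;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Relationship;
    import org.neo4j.graphdb.RelationshipType;
    import org.neo4j.graphdb.Transaction;

    public class StripedCollection
    {
        private static final int BUCKETS = 16;
        private static final RelationshipType HAS_BUCKET =
                DynamicRelationshipType.withName( "HAS_BUCKET" );
        private static final RelationshipType MEMBER =
                DynamicRelationshipType.withName( "MEMBER" );

        // create the collection node with BUCKETS chain-head nodes up front
        static Node createCollection( GraphDatabaseService graphDb )
        {
            Node collection = graphDb.createNode();
            for ( int i = 0; i < BUCKETS; i++ )
            {
                Node bucket = graphDb.createNode();
                bucket.setProperty( "bucket", i );
                collection.createRelationshipTo( bucket, HAS_BUCKET );
            }
            return collection;
        }

        // writers touching different buckets do not contend for the same chain head
        static void addMember( GraphDatabaseService graphDb, Node collection, Node member )
        {
            Transaction tx = graphDb.beginTx();
            try
            {
                int bucketId = (int) ( member.getId() % BUCKETS );
                for ( Relationship rel : collection.getRelationships( HAS_BUCKET ) )
                {
                    Node bucket = rel.getEndNode();
                    if ( (Integer) bucket.getProperty( "bucket" ) == bucketId )
                    {
                        bucket.createRelationshipTo( member, MEMBER );
                        break;
                    }
                }
                tx.success();
            }
            finally
            {
                tx.finish();
            }
        }
    }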
Re: [Neo4j] Servlet and Other Multi Threaded Environments
Hi, You can assume all Neo4j APIs are thread safe. If something is not thread safe it will be explicitly stated in the javadocs. If you keep all state that has to be shared between threads in the graph and all other state thread local you don't have to perform any external (or extra) synchronization at all. If you have state that is not graph data and needs to be shared between threads keep that in separate synchronization blocks without invoking any mutating operations on the graph. Regards, Johan On Sat, Jun 4, 2011 at 7:22 PM, McKinley mckinley1...@gmail.com wrote: I'm working with Neo4j as an EmbeddedGraphDatabase on a web server. It is not Servlets, but the same multi threaded concerns apply. Is this http://wiki.neo4j.org/content/Servlets_with_Neo4j still the most current example of dealing with multi threaded concerns? I see many mentions on avoiding unnecessary uses of synchronized in the current documentation at http://docs.neo4j.org/. Can those warnings in the documentation have a simple multi threaded example included too? Thanks, McKinley ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Persistent vs in-memory storage
Hi, Neo4j requires a filesystem and that filesystem may be mounted in RAM or not. You can only control the location of the graph store through the API but it has to be on a supported filesystem. On Linux a in memory graph db can easily be created using /dev/shm: GraphDatabaseService inMemoryGraphDb = new EmbeddedGraphDatabase( /dev/shm/graphdb ); Regards, Johan On Tue, Jun 7, 2011 at 7:24 AM, udayan khurana udayankhur...@gmail.com wrote: Hi all, I am curious to know whether I can control the storage of my graph with Neo4J through the API. I didn't find any documentation related to that. Thanks Udayan ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Servlet and Other Multi Threaded Environments
Since it is volatile you could remove the synchronized on getGraphDatabase() but if your app depends on getting null graphdb when context has been destroyed you would have to rewrite the contextDestroyed to something like: public void contextDestroyed( ServletContextEvent event ) { synchronized ( GraphDatabaseContext.class ) { GraphDatabaseService neo = graphDb; graphDb = null; if ( neo != null ) neo.shutdown(); } } What we really want is final but not possible to do here. -Johan On Tue, Jun 7, 2011 at 10:12 AM, McKinley mckinley1...@gmail.com wrote: Johan, In that Servlet example is the synchronized get on the graphDb reference still necessary on the ServletContextListener? Thanks, McKinley On Tue, Jun 7, 2011 at 1:03 AM, Johan Svensson jo...@neotechnology.comwrote: Hi, You can assume all Neo4j APIs are thread safe. If something is not thread safe it will be explicitly stated in the javadocs. If you keep all state that has to be shared between threads in the graph and all other state thread local you don't have to perform any external (or extra) synchronization at all. If you have state that is not graph data and needs to be shared between threads keep that in separate synchronization blocks without invoking any mutating operations on the graph. Regards, Johan On Sat, Jun 4, 2011 at 7:22 PM, McKinley mckinley1...@gmail.com wrote: I'm working with Neo4j as an EmbeddedGraphDatabase on a web server. It is not Servlets, but the same multi threaded concerns apply. Is this http://wiki.neo4j.org/content/Servlets_with_Neo4j still the most current example of dealing with multi threaded concerns? I see many mentions on avoiding unnecessary uses of synchronized in the current documentation at http://docs.neo4j.org/. Can those warnings in the documentation have a simple multi threaded example included too? Thanks, McKinley ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
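Putting the pieces of this thread together, a sketch of the listener with Johan's changes applied (volatile field, unsynchronized getter, synchronized shutdown). The GraphDatabaseContext class name is taken from the snippet above and the store path is an example.

    import javax.servlet.ServletContextEvent;
    import javax.servlet.ServletContextListener;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    public class GraphDatabaseContext implements ServletContextListener
    {
        // volatile so all request threads see the current reference
        private static volatile GraphDatabaseService graphDb;

        public static GraphDatabaseService getGraphDatabase()
        {
            return graphDb;
        }

        public void contextInitialized( ServletContextEvent event )
        {
            graphDb = new EmbeddedGraphDatabase( "data/graph.db" );
        }

        public void contextDestroyed( ServletContextEvent event )
        {
            synchronized ( GraphDatabaseContext.class )
            {
                GraphDatabaseService neo = graphDb;
                graphDb = null;
                if ( neo != null )
                {
                    neo.shutdown();
                }
            }
        }
    }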
Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment
Hi Jennifier, Could you apply this patch to the kernel and then see if the problem still exists? If you want I can send you a jar but then I need to know what version of Neo4j you are using. Regards, Johan On Mon, May 23, 2011 at 6:50 PM, Jennifer Hickey jhic...@vmware.com wrote: Hi Tobias, Looks like the environment is still setup, so I should be able to attempt a repro with a patched version. Let me know what you would like me to use. Thanks, Jennifer From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf Of Tobias Ivarsson [tobias.ivars...@neotechnology.com] Sent: Monday, May 16, 2011 11:01 PM To: Neo4j user discussions Subject: Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment Hi Jennifer, Could you reproduce it on your side by doing the same kind of systems tests again? If you could then I'd be very happy if you could try a patched version that we have been working on and see if that fixes the issue. Cheers, Tobias On Tue, May 17, 2011 at 2:49 AM, Jennifer Hickey jhic...@vmware.com wrote: Hi Tobias, Unfortunately I don't have an isolated test case, as I was doing a fairly involved system test at the time. I may be able to have a colleague work on reproducing it at a later date (I've been diverted to something else for the moment). I was remote debugging with Eclipse, so I toggled a method breakpoint on Thread.interrupt() and then inspected the stack once the breakpoint was hit. Sorry I don't have more information at the moment. I agree that eliminating the interrupts sounds like the best approach, if possible. Thanks, Jennifer From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf Of Tobias Ivarsson [tobias.ivars...@neotechnology.com] Sent: Thursday, April 28, 2011 6:23 AM To: Neo4j user discussions Subject: Re: [Neo4j] ClosedChannelExceptions in highly concurrent environment Hi Jennifer, I'd first like to thank you for the testing and analysis you've done. Very useful stuff. Do you think you could send some test code our way that reproduces this issue? This is actually the first time this issue has been reported, so I wouldn't say it is a common issue. My guess is that your thread volume triggered a rare condition that wouldn't be encountered otherwise. I'm also curious to know how you found the source of the interruptions. When I debug thread interruptions I've never been able to find out where the thread got interrupted from without doing tedious procedures of breakpoint + logging + trying to match thread ids. If you have a better method for doing that I'd very much like to know. I think we should focus the effort on fixing the interruption issue if we can. And I believe we would be able to do that if the interruptions do in fact originate from where you say they do. But the suggestion of being able to switch the lucene directory implementation is still interesting, but as you point out since it has issues on some platforms it would be better if we could be rid of the interruption issue. Cheers, Tobias On Thu, Apr 28, 2011 at 12:41 AM, Jennifer Hickey jhic...@vmware.com wrote: Hello, I've been running some tests w/approx 400 threads reading various indexed property values. I'm running on 64 bit Linux. I was frequently seeing the ClosedChannelException below. The javadoc on Lucene's NIOFSDirectory states that Accessing this class either directly or indirectly from a thread while it's interrupted can close the underlying file descriptor immediately if at the same time the thread is blocked on IO. 
The file descriptor will remain closed and subsequent access to {@link NIOFSDirectory} will throw a {@link ClosedChannelException}. If your application uses either {@link Thread#interrupt()} or {@link Future#cancel(boolean)} you should use {@link SimpleFSDirectory} in favor of {@link NIOFSDirectory}. A bit of debugging revealed that the Thread.interrupts were coming from Neo4j, specifically in RWLock and MappedPersistenceWindow. So it seems like this would be a common problem, though perhaps I am missing something? SimpleFSDirectory seems a bit of a performance bottleneck, so I switched to MMapDirectory and the problem did go away. I didn't see a way to switch implementations w/out modifying neo4j code, so I changed LuceneDataSource as follows: static Directory getDirectory( String storeDir, IndexIdentifier identifier ) throws IOException { MMapDirectory dir=new MMapDirectory(getFileDirectory( storeDir, identifier), null); if(MMapDirectory.UNMAP_SUPPORTED) { dir.setUseUnmap(true); } return dir; } So I'm wondering if others have seen this problem and/or if there is a recommended solution? Our product runs on quite a few different operating
Re: [Neo4j] question about remove and iterate in same transaction
Hi Jose, Does http://docs.neo4j.org/chunked/1.3/transactions-delete.html answer your question? Regards, Johan On Tue, May 24, 2011 at 4:34 AM, Jose Angel Inda Herrera jai...@estudiantes.uci.cu wrote: hello list, I wonder when a node will be removed in a transaction, since I have a transaction in which I delete a node in the graph, but I need to iterate lso nodes of the graph in the same transaction thanks, cheers ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Strange error that keep my database unable to start.
Hi, What version of Neo4j are you running and are there any other error messages written to the console or to messages.log when you start up? Do you have the neo4j-lucene-index component on the classpath? The global transaction log contains a transaction that included the Lucene data source (branch id 0x162374) but it cannot find that data source, so either neo4j-lucene-index is not on the classpath or something fails when the lucene index is performing recovery. Regards, Johan On Fri, May 20, 2011 at 11:35 AM, Pere Urbon Bayes p...@moviepilot.com wrote: Hi! I have a test server where my new application is going to live for a while before going into production. But today I found this error, which was preventing the neo4j database from starting. Any idea why it happened? It is quite dangerous when this happens and I cannot start the database; if this happens in production, it will not be good, xD! Fri May 20 09:25:24 CEST 2011: TM opening log: .../releases/20110519155401/db/tm_tx_log.1 Fri May 20 09:25:24 CEST 2011: TM non resolved transactions found in ../releases/20110519155401/db/tm_tx_log.1 Fri May 20 09:25:24 CEST 2011: Startup failed No mapping found for branchId[0x162374] org.neo4j.graphdb.TransactionFailureException: No mapping found for branchId[0x162374] at org.neo4j.kernel.impl.transaction.XaDataSourceManager.getXaResource(XaDataSourceManager.java:185) at org.neo4j.kernel.impl.transaction.TxManager.getXaResource(TxManager.java:933) at org.neo4j.kernel.impl.transaction.TxManager.buildRecoveryInfo(TxManager.java:414) at org.neo4j.kernel.impl.transaction.TxManager.recover(TxManager.java:255) at org.neo4j.kernel.impl.transaction.TxManager.init(TxManager.java:179) at org.neo4j.kernel.impl.transaction.TxModule.start(TxModule.java:96) at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:160) at org.neo4j.kernel.EmbeddedGraphDbImpl.init(EmbeddedGraphDbImpl.java:165) at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:80) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAcces Kind regards, /purbon ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Neo4J sizing calculations
Hi, This will depend on the types of queries, access patterns and what the data look like. Could you provide some more information on what the data look like, specifically relationships traversed and properties loaded for a query? Regarding adding another machine to an already active cluster, it is easy. Just configure it (assign an id and point it to the cluster) then start it up. This will trigger replication of data to the new machine and once that is done it will be available. Regards, Johan On Sat, Apr 30, 2011 at 10:02 AM, Dima Gutzeit dima.gutz...@mailvision.com wrote: Dear list members, I am building a Neo4J cluster that should hold around 2 billion nodes with ~5 billion properties. Data will be mostly accessed for read, about 90/10. Around 200,000 concurrent users will require mostly read access to the database. Translated into queries, that is up to 10,000 per second. I need to calculate the sizing of such a cluster: number of machines, required RAM, CPU and disk space. Any suggestions? Another question is how complicated it is to add a new machine to an active cluster while the system is running; is it achievable? Thanks in advance. Regards, Dima Gutzeit. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Possible performance regression issue?
Rick, I wrote a few tests trying to reproduce the slowdown with larger batch size but could not. Larger batch size results in a stable throughput while small batch sizes will spend more time flushing to disk (creating and deleting relationships the way you describe). Could you provide a test case for this that triggers the problem? -Johan On Tue, Mar 22, 2011 at 12:53 PM, Rick Bullotta rick.bullo...@thingworx.com wrote: Hi, Johan. I've allocated 500M to the relationship store, so that's probably not the limitation (the current relationship store size on disk is about 100M). My thought is that we are manipulating a lot of relationships (adding/deleting) within the transaction, and in fact, some (many) of the relationships that are added during the transaction are deleted during the same transaction and never actually saved. The scenario is the creation of an ordered linked list using nodes/relationships, and as each new item is inserted, there are potentially 2-3 relationships that will be destroyed/created. In fact, if 5000 items are inserted, only 5002 relationships will be ultimately saved, although 15000+ will have been created in total, with 1 of them being deleted. I'm not sure how to optimize that much further, though I'll look into it. I was considering using the Lucene index, but it does not have an obvious way to allow us to traverse from both the beginning and the end of the index. Best, Rick -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: Tuesday, March 22, 2011 5:56 AM To: Neo4j user discussions Subject: Re: [Neo4j] Possible performance regression issue? Could you start by verifying it is not GC related. Turn on verbose GC and see if larger transactions trigger GC pause times. Another possible cause could be that the relationship store file has grown so configuration needs to be tweaked. The OS may be flushing pages to disk when it should not. There is a guide how to investigate and tweak that when running on Linux http://wiki.neo4j.org/content/Linux_Performance_Guide This could also be an issue with the setup of the persistence windows when not using memory mapped buffers. I remember those settings got tweaked some after 1.1 release. We could try make some changes there but it would be better to first perform some profiling before doing that. Regards, Johan On Mon, Mar 21, 2011 at 11:07 PM, Rick Bullotta rick.bullo...@thingworx.com wrote: Here's the quick summary of what we're encountering: We are inserting large numbers of activity stream entries on a nearly constant basis. To optimize transactioning, we queue these up and have a single scheduled task that reads the entries from the queue and persists them to Neo. Within these transactions, it's possible that a very large number of relationships will be created and deleted (sometimes create and deleted all within the transaction, since we are managing something similar to an index). I've noticed that the time required to handle the inserts (not just the total, but the time per insert) degrades DRAMATICALLY if there are more than a few hundred entries to write. It is very fast if there are 100 entries in the batch, but very slow if there are over 1000. With Neo 1.1, we did not notice this behavior. We have tried Neo 1.2 and 1.3 and both seem to exhibit this behavior. Can anyone provide any insight into possible causes/fixes? Thanks, Rick ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
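To make the scenario concrete, a sketch of the kind of test case Johan asks for: an ordered linked list where each insert deletes one NEXT relationship and creates two, committed in batches whose size can be varied. Names, sizes and the timing output are illustrative, not Rick's actual code.

    import org.neo4j.graphdb.Direction;
    import org.neo4j.graphdb.DynamicRelationshipType;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Relationship;
    import org.neo4j.graphdb.RelationshipType;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    public class LinkedListInsertTest
    {
        private static final RelationshipType NEXT =
                DynamicRelationshipType.withName( "NEXT" );

        public static void main( String[] args )
        {
            int batchSize = args.length > 0 ? Integer.parseInt( args[0] ) : 1000;
            GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "target/list-db" );
            Transaction tx = graphDb.beginTx();
            try
            {
                Node head = graphDb.createNode();
                long start = System.currentTimeMillis();
                for ( int i = 0; i < 5000; i++ )
                {
                    // insert after the head: delete the old NEXT, create two new ones
                    Relationship oldNext = head.getSingleRelationship( NEXT, Direction.OUTGOING );
                    Node entry = graphDb.createNode();
                    entry.setProperty( "seq", i );
                    if ( oldNext != null )
                    {
                        Node oldFirst = oldNext.getEndNode();
                        oldNext.delete();
                        entry.createRelationshipTo( oldFirst, NEXT );
                    }
                    head.createRelationshipTo( entry, NEXT );
                    if ( ( i + 1 ) % batchSize == 0 )
                    {
                        tx.success();
                        tx.finish();
                        tx = graphDb.beginTx();
                        System.out.println( ( i + 1 ) + " inserts after "
                                + ( System.currentTimeMillis() - start ) + " ms" );
                    }
                }
                tx.success();
            }
            finally
            {
                tx.finish();
            }
            graphDb.shutdown();
        }
    }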
Re: [Neo4j] Possible performance regression issue?
Could you start by verifying it is not GC related. Turn on verbose GC and see if larger transactions trigger GC pause times. Another possible cause could be that the relationship store file has grown so configuration needs to be tweaked. The OS may be flushing pages to disk when it should not. There is a guide how to investigate and tweak that when running on Linux http://wiki.neo4j.org/content/Linux_Performance_Guide This could also be an issue with the setup of the persistence windows when not using memory mapped buffers. I remember those settings got tweaked some after 1.1 release. We could try make some changes there but it would be better to first perform some profiling before doing that. Regards, Johan On Mon, Mar 21, 2011 at 11:07 PM, Rick Bullotta rick.bullo...@thingworx.com wrote: Here's the quick summary of what we're encountering: We are inserting large numbers of activity stream entries on a nearly constant basis. To optimize transactioning, we queue these up and have a single scheduled task that reads the entries from the queue and persists them to Neo. Within these transactions, it's possible that a very large number of relationships will be created and deleted (sometimes create and deleted all within the transaction, since we are managing something similar to an index). I've noticed that the time required to handle the inserts (not just the total, but the time per insert) degrades DRAMATICALLY if there are more than a few hundred entries to write. It is very fast if there are 100 entries in the batch, but very slow if there are over 1000. With Neo 1.1, we did not notice this behavior. We have tried Neo 1.2 and 1.3 and both seem to exhibit this behavior. Can anyone provide any insight into possible causes/fixes? Thanks, Rick ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] InvalidRecordException
Hi, I am assuming no manual modifying of log files or store files at runtime or between shutdowns/crashes and startups has been performed. What filesystem are you running this on (and with what configuration)? Massimo since you say it happen more and more if the db grows can you write a test case that starts from a clean db and triggers the problem? -Johan On Fri, Mar 11, 2011 at 9:52 AM, Massimo Lusetti mluse...@gmail.com wrote: On Thu, Mar 10, 2011 at 9:58 PM, Massimo Lusetti mluse...@gmail.com wrote: On Thu, Mar 10, 2011 at 6:11 PM, Axel Morgner a...@morgner.de wrote: Hi, I'm getting an InvalidRecordException org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Node[5] is neither firstNode[37781] nor secondNode[37782] for Relationship[188125] at org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.getMoreRelationships(ReadTransaction.java:131) at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$ReadOnlyResourceConnection.getMoreRelationships(NioNeoDbPersistenceSource.java:280) at org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:100) at org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:585) at org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:358) at org.neo4j.kernel.impl.core.IntArrayIterator.hasNext(IntArrayIterator.java:115) when iterating through the relationships of a certain node: Node node = graphDb.getNodeById(sNode.getId()); IterableRelationship rels = node.getRelationships(relType, dir); for (Relationship r : rels) { - here the expeption occurs ... } I'm using 1.3.M03. Seems that the database is in an inconsitant state. Don't know how this could happen ... Greetings Axel ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user I'm encountering this same/similar exception quite ofter lately, with the same 1.3.M03 version, on FreeBSD 8.2 with OpenJDK Runtime Environment (build 1.6.0-b21) OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode): org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Record[4130751] not in use at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getRecord(RelationshipStore.java:194) at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getRecord(RelationshipStore.java:96) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.connectRelationship(WriteTransaction.java:1435) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.relationshipCreate(WriteTransaction.java:1389) at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaConnection$RelationshipEventConsumerImpl.createRelationship(NeoStoreXaConnection.java:256) at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$NioNeoDbResourceConnection.relationshipCreate(NioNeoDbPersistenceSource.java:370) at org.neo4j.kernel.impl.persistence.PersistenceManager.relationshipCreate(PersistenceManager.java:153) at org.neo4j.kernel.impl.core.NodeManager.createRelationship(NodeManager.java:309) at org.neo4j.kernel.impl.core.NodeImpl.createRelationshipTo(NodeImpl.java:387) at org.neo4j.kernel.impl.core.NodeProxy.createRelationshipTo(NodeProxy.java:186) Did it rings and alert bell!? 
Cheers -- Massimo http://meridio.blogspot.com I still getting these errors more and more as the DB grows in size and node/relations numbers: org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Node[8388] is neither firstNode[0] nor secondNode[0] for Relationship[4925127] at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.getMoreRelationships(WriteTransaction.java:909) at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaConnection$RelationshipEventConsumerImpl.getMoreRelationships(NeoStoreXaConnection.java:304) at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$NioNeoDbResourceConnection.getMoreRelationships(NioNeoDbPersistenceSource.java:465) at org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:100) at org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:585) at org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:358) at org.neo4j.kernel.impl.core.IntArrayIterator.hasNext(IntArrayIterator.java:115) Do you think the DB is getting corrupted? -- Massimo http://meridio.blogspot.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Help with exception using BatchInserter
Hello, I am having a hard time following what the problems really are since the conversation is split up across several threads. Pablo, you had a problem with the batch inserter throwing an exception upon shutdown that I suspected was due to not enough available disk space. Then there was the 'too many open files' issue. Are you still experiencing problems with that? Massimo, you had problems with injection that created duplicates due to a synchronization issue. That issue has been resolved and now you are experiencing a slowdown during batch inserter injection? Mark, I had a quick look at the code you posted and as I understand it you are saying that it is the index lookups that are taking too long? That could very well be the case. If the index lookup has to go down to disk the lookup time will be a few ms and that will kill any large data injection. For example inserting 500M relationships requiring 1B index lookups (one for each node) with an avg index lookup time of 1ms is 11 days worth of index lookup time... We have come up with some ideas to avoid the index lookup problem when it can't fit in RAM (only a problem during big data injections). It would be possible to write a tool that splits the injection process up into several steps, converting the data so we don't have to do index lookups while injecting relationships. However, before we write such a tool it would be great if you people could investigate some more what the problem really is and verify it is linked to index lookups and not something else. Regards, Johan On Thu, Feb 17, 2011 at 2:40 PM, Massimo Lusetti mluse...@gmail.com wrote: On Thu, Feb 17, 2011 at 12:54 PM, Pablo Pareja ppar...@era7.com wrote: Hi Massimo, It's too bad you are running into the same kind of situations (especially when the conclusion you came to is that Neo4j just degrades...). However, did you already try dividing the big insertion process into smaller steps? Well I do big transactions since the BatchInserter (from the wiki) is not an option for me and I'm doing 1 insert per transaction but as soon as the db grows performance drops inexorably. Here is a summary of the latest results, which store nodes with only one String (IPv4 address) property each: it starts from taking 1.05ms to insert a Node within a db with 440744 nodes and it ends up taking 8.75ms to insert a Node within a db with 12545155 nodes. The final DB size is 2.9G since I tweaked the string_block_size at graphdb creation time to 60 bytes instead of 120... If anyone is interested I could provide the complete table of progression... I mean, do you think Neo4j degradation is just proportional to DB size? It seems so, or at least to the number of nodes, but that is understandable; what makes me think is that performance is so bad that it compromises usability, but I understand I could be doing something wrong. or rather just to the amount of data being inserted in the same Batch Insertion? As I said I use the big transaction pattern from the wiki, not the batch insert. If anyone is interested I could provide more data... let me know, I hope to be able to use neo4j for this kind of work. -- Massimo http://meridio.blogspot.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
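One way to avoid the per-relationship index lookups Johan describes is to keep the external-key-to-node-id mapping in memory during injection and only use the Lucene index for online queries afterwards. A sketch that assumes the keys fit in RAM (IP strings here, loosely matching Massimo's data set; the store path and relationship type are made up):

    import java.util.HashMap;
    import java.util.Map;
    import org.neo4j.graphdb.DynamicRelationshipType;
    import org.neo4j.graphdb.RelationshipType;
    import org.neo4j.helpers.collection.MapUtil;
    import org.neo4j.kernel.impl.batchinsert.BatchInserter;
    import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

    public class InjectWithoutIndexLookups
    {
        private static final RelationshipType CONNECTED =
                DynamicRelationshipType.withName( "CONNECTED" );

        public static void main( String[] args )
        {
            BatchInserter inserter = new BatchInserterImpl( "target/ip-db" );
            // external key -> node id kept in RAM instead of asking Lucene for every row
            Map<String, Long> idByIp = new HashMap<String, Long>();
            String[][] rows = { { "10.0.0.1", "10.0.0.2" }, { "10.0.0.2", "10.0.0.3" } };
            for ( String[] row : rows )
            {
                long from = nodeFor( inserter, idByIp, row[0] );
                long to = nodeFor( inserter, idByIp, row[1] );
                inserter.createRelationship( from, to, CONNECTED, null );
            }
            inserter.shutdown();
        }

        private static long nodeFor( BatchInserter inserter, Map<String, Long> idByIp, String ip )
        {
            Long id = idByIp.get( ip );
            if ( id == null )
            {
                id = inserter.createNode( MapUtil.map( "ip", ip ) );
                idByIp.put( ip, id );
            }
            return id;
        }
    }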
Re: [Neo4j] Neo4J and sharding
And if your domain is shardable you can still shard the same way you would do using a relational database when using a graph database. -Johan On Wed, Jan 19, 2011 at 10:17 AM, Jim Webber j...@neotechnology.com wrote: Hello Luanne, Right now the only viable approach would be cache sharding (i.e. not really sharding at all) whereby read requests for related bits of information are sent to a specific replica (thereby keeping those caches warm and relevant). Jim ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Help needed! Obtaining the active transaction
Hi Rick, You should not cast it to an org.neo4j.graphdb.Transaction type but instead use the transaction manager to commit the transaction directly. Something like this: TransactionManager tm = (( EmbeddedGraphDatabase ) neo).getConfig().getTxModule().getTxManager(); tm.commit(); // will commit the current running tx tm.begin(); // will start a new transaction and the old top level org.neo4j.graphdb.Transaction will commit this one instead upon tx.finish() As I understand it you start a transaction somewhere using GraphDatabaseService.beginTx() but then deeper down in the code you may not have access to the top level org.neo4j.graphdb.Transaction but still need to commit any work (to avoid too large transactions)? One way to do that is to just invoke tm.commit() followed by tm.begin(). Regarding active locks there is no API to check what locks the current transaction has. However if you only commit the running transaction using the TM and start a new one you do not have to think about the potential deadlock issue I mentioned. -Johan On Sat, Jan 8, 2011 at 4:41 PM, rick.bullo...@burningskysoftware.com wrote: Hi, Johan. Would the following code also be legal to commit the currently running transaction and start a new one? I'm casting the Transaction from tm.getTransaction to an org.neo4j.graphdb.Transaction type. The part I'm not sure about is whether tm.begin creates/enlists a neo transaction or not. Should I instead use graphDatabaseService.beginTx()? The other approach that I'm contemplating involves never actually keeping a reference to a Transaction object and to always get the current transaction from the transaction manager whenever calling success(), failure(), or finish(). Combining this with the below code allows the wrappers to always reference the currently running transaction as needed. The root cause for all of this is that I'm trying to do block commits on a lengthy operation that could involve many thousands of database operations, but have it function inside our existing wrapper(s). Thanks for any guidance. Rick === TransactionManager tm = (( EmbeddedGraphDatabase ) neo).getConfig().getTxModule().getTxManager(); Transaction currentTx = (Transaction)tm.getTransaction(); currentTx.success(); currentTx.finish(); tm.begin(); === Original Message Subject: Re: [Neo4j] Help needed! Obtaining the active transaction From: Johan Svensson [1]jo...@neotechnology.com Date: Fri, January 07, 2011 8:34 am To: Neo4j user discussions [2]u...@lists.neo4j.org You can use the TransactionManager suspend/resume for this. Suspend the current transaction and start a new one using the underlying TM. Have a look at [3]https://svn.neo4j.org/components/rdf-sail/trunk/src/main/java/org/neo4j/rdf/sail/GraphDatabaseSailConnectionImpl.java to see how this can be done. You have to make sure the parent transaction that you temporarily suspend does not have any locks on nodes or relationships you intend to update in the new transaction to avoid deadlocks. Regards, Johan On Thu, Jan 6, 2011 at 8:03 PM, Rick Bullotta [4]rick.bullo...@burningskysoftware.com wrote: We have a situation where we have a wrapper that automatically starts/commits Neo transactions as needed. However, in some cases, within the logic that is wrapped by this wrapper, we want to break the work up into smaller units (deleting a lot of nodes/relationships, in this case), however, there is already an active transaction. Any suggestions on how to handle this?
Is there a way to grab the transaction from the thread context or somewhere else, so that we can commit it and start a new one? Many thanks, Rick ___ Neo4j mailing list [5]u...@lists.neo4j.org [6]https://lists.neo4j.org/mailman/listinfo/user References 1. mailto:jo...@neotechnology.com 2. mailto:user@lists.neo4j.org 3. https://svn.neo4j.org/components/rdf-sail/trunk/src/main/java/org/neo4j/rdf/sail/GraphDatabaseSailConnectionImpl.java 4. mailto:rick.bullo...@burningskysoftware.com 5. mailto:User@lists.neo4j.org 6. https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Johan Svensson [jo...@neotechnology.com] Chief Technology Officer, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
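Tying Johan's suggestion together, here is a minimal sketch (the 10000-operation chunk size and the work inside the loop are made-up placeholders) of committing in blocks via the transaction manager from deep inside a wrapper, without holding a reference to the top-level Transaction:

import javax.transaction.TransactionManager;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Relationship;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class BlockCommits
{
    static void deleteInBlocks( GraphDatabaseService neo, Iterable<Relationship> toDelete )
        throws Exception
    {
        TransactionManager tm =
            (( EmbeddedGraphDatabase ) neo).getConfig().getTxModule().getTxManager();
        tm.begin();
        int ops = 0;
        for ( Relationship rel : toDelete )
        {
            rel.delete();
            if ( ++ops % 10000 == 0 )
            {
                tm.commit(); // commit the current running tx
                tm.begin();  // immediately start a fresh one, keeping transactions small
            }
        }
        tm.commit();
    }
}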
Re: [Neo4j] Help needed! Obtaining the active transaction
You can use the TransactionManager suspend/resume for this. Suspend the current transaction and start a new one using the underlying TM. Have a look at https://svn.neo4j.org/components/rdf-sail/trunk/src/main/java/org/neo4j/rdf/sail/GraphDatabaseSailConnectionImpl.java to see how this can be done. You have to make sure the parent transaction that you temporarily suspend does not have any locks on nodes or relationships you intend to update in the new transaction to avoid deadlocks. Regards, Johan On Thu, Jan 6, 2011 at 8:03 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote: We have a situation where we have a wrapper that automatically starts/commits Neo transactions as needed. However, in some cases, within the logic that is wrapped by this wrapper, we want to break the work up into smaller units (deleting a lot of nodes/relationships, in this case), however, there is already an active transaction. Any suggestions on how to handle this? Is there a way to grab the transaction from the thread context or somewhere else, so that we can commit it and start a new one? Many thanks, Rick ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
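For reference, a minimal sketch of the suspend/resume pattern Johan describes (the nested work is a placeholder, and it assumes the suspended transaction holds no locks on the entities the nested transaction touches):

import javax.transaction.Transaction;
import javax.transaction.TransactionManager;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class SuspendResumeSketch
{
    static void doInNewTx( EmbeddedGraphDatabase db, Runnable work ) throws Exception
    {
        TransactionManager tm = db.getConfig().getTxModule().getTxManager();
        Transaction suspended = tm.suspend(); // detach the caller's transaction from this thread
        try
        {
            tm.begin();
            work.run();                        // placeholder for the real graph operations
            tm.commit();
        }
        finally
        {
            tm.resume( suspended );            // reattach the original transaction
        }
    }
}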
Re: [Neo4j] InvalidRecordException exception
On Thu, Dec 23, 2010 at 12:34 PM, George Ciubotaru george.ciubot...@weedle.com wrote: Taking a second look over that locking mechanism, I've noticed that it uses read locks for a delete operation. Should there be write locks instead? Yes, sorry about that, it should be write locks. The read locks will still allow concurrent transactions to progress past the higher level check of whether the relationship has been deleted or not (resulting in an InvalidRecordException instead of a NotFoundException). Regards, Johan ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] InvalidRecordException exception
Rick, Yes, the patch I provided for you has been included in the milestone releases since M04. -Johan On Fri, Dec 17, 2010 at 5:53 PM, rick.bullo...@burningskysoftware.com wrote: Hi, Johan. Is this related to the patch you provided me for a similar issue? I had thought it made it into the milestone release(s). Thanks, Rick Original Message Subject: Re: [Neo4j] InvalidRecordException exception From: Johan Svensson [1]jo...@neotechnology.com Date: Wed, December 15, 2010 8:32 am To: Neo4j user discussions [2]u...@lists.neo4j.org This will still happen in the 1.2.M05 release. I just wanted to make sure I linked the stacktrace's line numbers to the right part of the code since that exception being thrown at a different place in the delete method could mean there are other problems. -Johan On Wed, Dec 15, 2010 at 1:42 PM, George Ciubotaru [3]george.ciubot...@weedle.com wrote: Yes, the version I'm currently using is 1.1. Shall I understand that this kind of issue shouldn't occur in 1.2.M05? For the moment I'll take the pessimistic approach by guarding against (as in the example you gave) to assure that this is the reason and then I'll just accept the exception. Thank you for your quick and detailed response. Best regards, George -Original Message- From: [4]user-boun...@lists.neo4j.org [[5]mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: 15 December 2010 12:23 To: Neo4j user discussions Subject: Re: [Neo4j] InvalidRecordException exception Sorry, should also have asked what Neo4j version you use but guessing it is 1.1 or early milestone release? If so I think the problem is caused by two or more concurrent transactions running delete on the same relationship. If two transactions get a reference to the same relationship and concurrently delete that relationship it is possible for a InvalidRecordException to be thrown instead of a NotFoundException since the write lock is grabbed after the relationship has been verified to exist. Solution is either to accept the exception or to guard against it by first acquiring a read lock on the relationship before invoking relationship.delete(). Code example how to do this: GraphDatabaseService gdb; // the graph db Relationship relationship; // the relationship to delete LockManager lockManager = ((EmbeddedGraphDatabase) gdb).getConfig().getLockManager(); lockManager.getReadLock( relationship ); try { relationship.delete(); } finally { lockManager.releaseReadLock( relationship ); } Regards, Johan ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Neo4J logs and the Embedded Server
At the moment the following information will be logged to the messages file: o How Neo4j was configured on startup o OS and JVM info together with available RAM and heap size o The Neo4j version used o Information about classpath in use o Garbage collector in use o Failure information related to startup and shutdown o Information related to recovery on startup after non clean shutdown o Logical log information (performing backups or rotation of the log) Normal operation when all is well or when user performs operations that are invalid (resulting in an exception) is not logged. /Johan On Wed, Dec 15, 2010 at 9:44 AM, Luanne Misquitta lmisqui...@saba.com wrote: Hi Johan, The JMX is great- however, I am also looking at cases where let's say a customer reports in a typical way, that 'something crashed'. Our only indication of anything that may have gone wrong e.g. transaction/locking/anything else would be a log which they could send us. It's perfectly ok if the application has to bear the responsibility for logging any such exceptions, but was just wondering if Neo4J also maintained such logs. I have seen the messages.txt- just not sure what type of events it logs. Thanks Luanne M. Tech Lead twitter / @luannem linkedin / http://in.linkedin.com/in/luannemisquitta skype / luanne.misquitta blog / http://thought-bytes.blogspot.com/ Saba. Power Up Your People. -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: Tuesday, December 14, 2010 6:59 PM To: Neo4j user discussions Subject: Re: [Neo4j] Neo4J logs and the Embedded Server Hi, There are some logging performed through the java.util.logging.Logger and the org.neo4j.kernel.impl.util.StringLogger (check db-dir/messages.log). What kind of logging are you interested in? For normal monitoring and health check you can use JMX (http://wiki.neo4j.org/content/Monitoring_and_Deployment#Monitoring_via_JMX). We will add more logging in the future that can be turned on to help developers to debug problems (such as performance issues). Feedback on what kind of logging you would like to see is very welcomed. Regards, Johan On Tue, Dec 14, 2010 at 4:39 AM, Luanne Misquitta lmisqui...@saba.com wrote: Hi, Does Neo4J maintain running logs while using the embedded server, or must an application be responsible for logging? If Neo4J does log, can you configure log levels, etc.? Regards Luanne M. Tech Lead twitter / @luannem http://twitter.com/luannem linkedin / http://in.linkedin.com/in/luannemisquitta http://in.linkedin.com/in/luannemisquitta skype / luanne.misquitta blog / http://thought-bytes.blogspot.com/ http://thought-bytes.blogspot.com/ Saba. Power Up Your People. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] InvalidRecordException exception
Sorry, should also have asked what Neo4j version you use, but guessing it is 1.1 or an early milestone release? If so I think the problem is caused by two or more concurrent transactions running delete on the same relationship. If two transactions get a reference to the same relationship and concurrently delete that relationship it is possible for an InvalidRecordException to be thrown instead of a NotFoundException since the write lock is grabbed after the relationship has been verified to exist. The solution is either to accept the exception or to guard against it by first acquiring a read lock on the relationship before invoking relationship.delete(). Code example of how to do this: GraphDatabaseService gdb; // the graph db Relationship relationship; // the relationship to delete LockManager lockManager = ((EmbeddedGraphDatabase) gdb).getConfig().getLockManager(); lockManager.getReadLock( relationship ); try { relationship.delete(); } finally { lockManager.releaseReadLock( relationship ); } Regards, Johan Two different transactions/threads invoke delete on either separate node On Wed, Dec 15, 2010 at 11:59 AM, George Ciubotaru george.ciubot...@weedle.com wrote: Yes, here it is: Caused by: org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Record[176917] not in use at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getRecord(RelationshipStore.java:190) at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getRecord(RelationshipStore.java:93) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.relDelete(WriteTransaction.java:655) at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaConnection$RelationshipEventConsumerImpl.deleteRelationship(NeoStoreXaConnection.java:262) at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$NioNeoDbResourceConnection.relDelete(NioNeoDbPersistenceSource.java:375) at org.neo4j.kernel.impl.persistence.PersistenceManager.relDelete(PersistenceManager.java:158) at org.neo4j.kernel.impl.core.NodeManager.deleteRelationship(NodeManager.java:808) at org.neo4j.kernel.impl.core.RelationshipImpl.delete(RelationshipImpl.java:164) at org.neo4j.kernel.impl.core.RelationshipProxy.delete(RelationshipProxy.java:50) at Graphing.Graph.deleteRelationships(Graph.java:1234) Thanks, George -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: 15 December 2010 10:44 To: Neo4j user discussions Subject: Re: [Neo4j] InvalidRecordException exception Hi George, Could you provide the full stacktrace for the exception? Regards, Johan On Wed, Dec 15, 2010 at 11:33 AM, George Ciubotaru george.ciubot...@weedle.com wrote: Hi David, We've built our own REST service in front of the Neo4j graph to interact with it from a different environment. The operations are simple: - create node (each node has a single property attached to it that defines its type) ... Transaction tx = graphDb.beginTx(); try { Node gNode = graphDb.createNode(); gNode.setProperty(NodeTypeDefString, vNodetype.convertToInt()); nodeId = gNode.getId(); tx.success(); } ... finally { tx.finish(); } ... - delete node (together with all its relationships): ... Transaction tx = graphDb.beginTx(); try { Node node = graphDb.getNodeById(nodeId); deleteNodeRelationships(node); node.removeProperty(NodeTypeDefString); node.delete(); tx.success(); } ... finally { tx.finish(); } ... - create relationships ... Transaction tx = graphDb.beginTx(); try { ... leftSideNode.createRelationshipTo(rightSideNode, relationshipType); tx.success(); } ... finally { tx.finish(); } ...
- delete relationships between 2 nodes (of a certain type and direction) ... Transaction tx = graphDb.beginTx(); try { Iterable<Relationship> relations = leftSideNode.getRelationships(relationshipType, direction); for (Relationship relation : relations
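The quoted snippet is cut off by the archive at this point; a minimal sketch of how such a delete loop would typically continue (the body below is an assumption for illustration, not George's original code):

Iterable<Relationship> relations = leftSideNode.getRelationships( relationshipType, direction );
for ( Relationship relation : relations )
{
    // assumed check: only remove edges that actually connect to the other node in question
    if ( relation.getOtherNode( leftSideNode ).equals( rightSideNode ) )
    {
        relation.delete();
    }
}
tx.success();
// ... followed by tx.finish() in the finally block, as in the other snippets above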
Re: [Neo4j] InvalidRecordException exception
This will still happen in the 1.2.M05 release. I just wanted to make sure I linked the stacktrace's line numbers to the right part of the code since that exception being thrown at a different place in the delete method could mean there are other problems. -Johan On Wed, Dec 15, 2010 at 1:42 PM, George Ciubotaru george.ciubot...@weedle.com wrote: Yes, the version I'm currently using is 1.1. Shall I understand that this kind of issue shouldn't occur in 1.2.M05? For the moment I'll take the pessimistic approach by guarding against (as in the example you gave) to assure that this is the reason and then I'll just accept the exception. Thank you for your quick and detailed response. Best regards, George -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: 15 December 2010 12:23 To: Neo4j user discussions Subject: Re: [Neo4j] InvalidRecordException exception Sorry, should also have asked what Neo4j version you use but guessing it is 1.1 or early milestone release? If so I think the problem is caused by two or more concurrent transactions running delete on the same relationship. If two transactions get a reference to the same relationship and concurrently delete that relationship it is possible for a InvalidRecordException to be thrown instead of a NotFoundException since the write lock is grabbed after the relationship has been verified to exist. Solution is either to accept the exception or to guard against it by first acquiring a read lock on the relationship before invoking relationship.delete(). Code example how to do this: GraphDatabaseService gdb; // the graph db Relationship relationship; // the relationship to delete LockManager lockManager = ((EmbeddedGraphDatabase) gdb).getConfig().getLockManager(); lockManager.getReadLock( relationship ); try { relationship.delete(); } finally { lockManager.releaseReadLock( relationship ); } Regards, Johan ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Reference node pains.
Hi Marko, On Fri, Dec 10, 2010 at 7:35 PM, Marko Rodriguez okramma...@gmail.com wrote: Hello. I have one question and a comment: QUESTION: Is the reference node always id 0 on a newly created graph? Yes. COMMENT: By chance, will you guys remove the concept of a reference node in the future? I've noticed this to be a pain in the side for people moving between various graph systems: going from Neo4j to iGraph to TinkerPop, etc. The reference node, if the user is not conscious of it, begins to build up as data is migrated into and from Neo4j graphs. And what ensues is a data bug. Perhaps something like new GraphDatabaseService(String directory, boolean createReferenceNode)...? The reference node is very helpful in certain use-cases. The current implementation could however be improved. Would having the option to create a graph without the reference node solve the problems you are experiencing? -Johan Thanks, Marko. http://markorodriguez.com http://tinkerpop.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
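One possible workaround (an assumption for illustration, not something stated in this thread) is to delete the reference node right after creating a brand new store, before any data is attached to it, so migrated data cannot accumulate around it:

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class DropReferenceNode
{
    public static void main( String[] args )
    {
        GraphDatabaseService db = new EmbeddedGraphDatabase( "target/fresh.db" );
        Transaction tx = db.beginTx();
        try
        {
            db.getReferenceNode().delete(); // node id 0 in a newly created graph
            tx.success();
        }
        finally
        {
            tx.finish();
        }
        db.shutdown();
    }
}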
Re: [Neo4j] Performance DB with large datasets
Since the small graph works well and it looks like you are performing writes together with reads a possible cause could be OS writing out dirty pages to disk when it should not. Have a look at http://wiki.neo4j.org/content/Linux_Performance_Guide While running the test execute: #watch grep -A 1 dirty /proc/vmstat to check if the numbers for nr_dirty and nr_writeback increases. If that is the case you need to change the ratio settings as described in the guide. Regards, Johan On Tue, Dec 7, 2010 at 7:22 PM, Marius Kubatz marius.kub...@udo.edu wrote: Hi, there is still no difference in the performance, which is somewhat disturbing. I cant't see the allocation of nioneo memory mapping in the java process at all. It goes up to the heap size and then stops there. Marius 2010/12/7 Marius Kubatz marius.kub...@udo.edu: Hi Peter, thank you very much for your quick reply, unfortunately there is no messages.log, seems I have an older db version. I'm sending you the ls dump from the directory: total 5318580 11 active_tx_log 4096 lucene 4096 lucene-fulltext 27 neostore 9 neostore.id 34954011 neostore.nodestore.db 9 neostore.nodestore.db.id 1917225350 2 neostore.propertystore.db 133 neostore.propertystore.db.arrays 9 neostore.propertystore.db.arrays.id 190425 neostore.propertystore.db.id 10485 neostore.propertystore.db.index 9 neostore.propertystore.db.index.id 10449 neostore.propertystore.db.index.keys 9 neostore.propertystore.db.index.keys.id 2047597776 neostore.propertystore.db.strings 30790905 neostore.propertystore.db.strings.id 901093347 neostore.relationshipstore.db 149433 neostore.relationshipstore.db.id 20 neostore.relationshiptypestore.db 9 neostore.relationshiptypestore.db.id 215 neostore.relationshiptypestore.db.names 9 neostore.relationshiptypestore.db.names.id 2097160 nioneo_logical.log.1 4 nioneo_logical.log.active 88 tm_tx_log.1 29365 tm_tx_log.2 I have 3.848.862 nodes and 53.355.402 relationships in my graph. thus I created the following neo4j props file: neostore.nodestore.db.mapped_memory=30M neostore.relationshipstore.db.mapped_memory=1685M neostore.propertystore.db.mapped_memory=1000M neostore.propertystore.db.strings.mapped_memory=1000M neostore.propertystore.db.arrays.mapped_memory=0M I have 8GB Ram and gave JavaVM is running with -Xmx2048m , and the mapping should consume 4GB. Just started the experiment again the first run is traversing a neighborhood of : 124 nodes and 2.279.166 edges, so I'm very curious how this will end :) Thanks for your help! Regards Marius -- Programs must be written for people to read, and only incidentally for machines to execute. - Abelson Sussman, SICP, preface to the first edition ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] strange performance
My guess would be that the long pause times are because you start traversing parts of the graph that are not cached at all. The memory mapped buffers are assigned 1800M but the db is much larger than that. The solution would be to install more memory or switch to an SSD. You could also try lowering the heap size and switching to cache_type=weak, freeing up some memory for the memory mapped buffers. Regards, Johan On Tue, Nov 30, 2010 at 11:43 AM, Martin Grimmer martin.grim...@unister.de wrote: Hi, here is the extra output: Physical mem: 3961MB Heap size: 1809MB store_dir=/neo4j_database/ rebuild_idgenerators_fast=true neostore.propertystore.db.index.keys.mapped_memory=1M logical_log=/neo4j_database//nioneo_logical.log neostore.propertystore.db.strings.mapped_memory=69M neostore.propertystore.db.arrays.mapped_memory=1M neo_store=/neo4j_database//neostore neostore.relationshipstore.db.mapped_memory=1372M neostore.propertystore.db.index.mapped_memory=1M create=true neostore.propertystore.db.mapped_memory=275M dump_configuration=true neostore.nodestore.db.mapped_memory=91M dir=/neo4j_database//lucene-fulltext On 23.11.2010 11:07, Johan Svensson wrote: Hi, Could you add the following configuration parameter: dump_configuration=true and send the output printed to standard out when starting up. Other useful information would be some thread dumps while executing a query that takes a long time (send a kill -3 signal to the process). Regards, Johan On Tue, Nov 23, 2010 at 10:53 AM, Martin Grimmer martin.grim...@unister.de wrote: Hello, while running my benchmark I did the following: added -verbose:gc to see gc usage and ran sar -u 1 100, and here are the results on a 4 core cpu: CPU %user %nice %system %iowait %steal %idle all 0,75 0,00 0,62 18,81 0,00 79,82 - Scaled to 1 core: actually the cpu does nothing to compute the queries. Only 0.75% * 4 = 3% CPU usage for a single core, and 18,81% * 4 = 75,24% io wait for a single core; the rest is divided into system and idle. The gc output: ... [GC 806876K->326533K(1852416K), 0.2419270 secs] ... many queries ... [GC 873349K->423494K(1837760K), 0.3257520 secs] ... many queries ... [GC 956678K->502630K(1643648K), 0.3619280 secs] ... many queries ... [GC 839654K->551462K(1686720K), 0.3088770 secs] ... So it's not the GC. Thanks to all of you, ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
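For the record, a minimal sketch of how those settings could be passed programmatically to an embedded database of that era (assuming the two-argument EmbeddedGraphDatabase constructor; the sizes below are placeholders taken from the output above, not a recommendation):

import java.util.HashMap;
import java.util.Map;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class WeakCacheConfig
{
    public static void main( String[] args )
    {
        // favour memory mapped buffers over the object cache/heap
        Map<String, String> config = new HashMap<String, String>();
        config.put( "cache_type", "weak" );
        config.put( "neostore.relationshipstore.db.mapped_memory", "1372M" );
        config.put( "neostore.propertystore.db.mapped_memory", "275M" );
        config.put( "dump_configuration", "true" );

        GraphDatabaseService db = new EmbeddedGraphDatabase( "/neo4j_database/", config );
        // ... run the benchmark queries ...
        db.shutdown();
    }
}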
Re: [Neo4j] strange performance
Hi, Could you add the following configuration parameter: dump_configuration=true and send the output printed to standard out when starting up. Other useful information would be some thread dumps while executing a query that takes long time (send kill -3 signal to the process). Regards, Johan On Tue, Nov 23, 2010 at 10:53 AM, Martin Grimmer martin.grim...@unister.de wrote: Hello, while running my benchmark i did the following: added: -verbose:gc to see gc usage run sar -u 1 100 and here are the results: on a 4 core cpu CPU %user %nice %system %iowait %steal %idle all 0,75 0,00 0,62 18,81 0,00 79,82 - Scaled to 1 core: actually the cpu does nothing to compute the queries. Only 0.75% * 4 = 3% CPU usage for a single core, and 18,81% * 4 = 75,24% io wait for a single core, the rest is divided into system and idle. The gc output: ... [GC 806876K-326533K(1852416K), 0.2419270 secs] ... many queries ... [GC 873349K-423494K(1837760K), 0.3257520 secs] ... many queries ... [GC 956678K-502630K(1643648K), 0.3619280 secs] ... many queries ... [GC 839654K-551462K(1686720K), 0.3088770 secs] ... So its not the GC. Thanks to all of you, -- * Martin Grimmer * Developer, Semantic Web Project, IT Unister GmbH Barfußgässchen 11 | 04109 Leipzig Telefon: +49 (0)341 49288 5064 martin.grim...@unister.de mailto:%0a%20%20martin.grim...@unister.de www.unister.de http://www.unister.de Vertretungsberechtigter Geschäftsführer: Thomas Wagner Amtsgericht Leipzig, HRB: 19056 ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] JTA support for Neo
This looks great, very good work! I would like to get this merged into trunk after we releases 1.2 (not after this iteration but after the next one). The changes looks minimal to me and hopefully there are no problems going forward with the current design. Looking forward to the guide so me and others can try this out. Would it be possible to continue investigate this and see how these changes would work in a Spring environment? The goal would be to run on an external TM, in Spring (using annotated transactions) and have both Neo4j and MySQL (or some other resource) participate in the same transaction. Regards, Johan On Sat, Nov 20, 2010 at 6:40 PM, Chris Gioran chris.gio...@gmail.com wrote: IMHO you should start a branch in the SVN so others can look at the code. So, at https://svn.neo4j.org/laboratory/users/cgioran/neo4j-kernel-jta/ you can find the kernel component with my changes incorporated. The classes added are org.neo4j.kernel.impl.transaction.TransactionManagerImpl org.neo4j.kernel.impl.transaction.TransactionManagerService that are the hooks for providing custom transaction managers. The first is an extension of javax.transaction.TransactionManager adding support for startup and shutdown, an operation present in all tx managers but not part of their API. This provides the ability to plugin custom implementations in the TxModule. The second is a convenience class that is extended by tx managers that are to be provided as a service. Also, changes are present in org.neo4j.kernel.impl.transaction.TxModule for using this new way of doing things, org.neo4j.kernel.impl.transaction.TxManager org.neo4j.kernel.impl.transaction.ReadOnlyTxManager for them to fit in this and org.neo4j.kernel.EmbeddedGraphDbImpl org.neo4j.kernel.Config to bind them. This fork is (or should be) completely compatible with the official kernel, so it can be used as a drop in replacement. Any deviation is a bug and if reported it will be fixed. The second project is at https://svn.neo4j.org/laboratory/users/cgioran/JOTMService/ and is a sample implementation of a tx manager service for JOTM. To use this, build it, add the resulting jar to your classpath and, if you are using the new jta fork of the kernel, you can pass a configuration parameter of tx_manager_impl=jotm to your EmbeddedGraphDatabase and presto!, if all is well you will be using a JOTM TxManager to do your thing. Of course, the jotm libraries must be also in your classpath, version 2.1.9 If this way of doing things is met with approval, I will write a complete guide to using the above, implementation and design details and as a result a how to for adding more external tx managers. There is more to come. CG ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] JTA support for Neo
Hi Chris, On Tue, Nov 9, 2010 at 7:34 PM, Chris Gioran chris.gio...@gmail.com wrote: Chris, Awesome! I think the next step would be to start testing things when neo4j needs to recover, rollback etc, I think this is where the problems arise :) Also, any chance of making a maven project out of it and having the project as a test component somewhere in the Svn repo, so it can be run as part of the QA process for releases etc? OK, so the plan I have in my mind is this: - Do runs to see what problems exist when rolling back/starting in recovery mode with the only resource being neo. - See how the whole thing works when another XA compatible resource is thrown in the mix, probably a RDBMS, checking that 2PC works. Yes, 2PC is what needs to be tested since on a 1PC the xa resource can during recovery figure out if a transaction should be committed or not. For a prepared 2PC transaction the global TM has to tell the resource to commit or rollback during recovery. Create a test that does the following: o create a transaction and enlist two resources (for example nioneodb and lucene-index, create a node and index it) o let the TM prepare both resources o let the TM send commit to one resource but crash the system before commit is sent to the other resource The system is now in a inconsistent state. One resource will have committed the changes and the other is just prepared. The global TM should detect this while doing recovery and invoke recover() on the appropriate resources and tell them what to do with the transactions that are in prepared state. How the global TM should get hold of a XaResource (creating it if needed) to invoke recover while investigating its transaction log is not really specified. Everyone seems to hack their own solution (JNDI, serialization to disk etc). If you can get the above test case working using other TMs together with Neo4j would be great! Regards, Johan - Find the least intrusive way of making neo fit in the picture, in terms of configuration/code changes etc, approve and commit those. - Write test cases and a maven project so that it can be integrated in the release cycle to be checked for correct functionality. - After that probably I would like to fill in the gaps so that from an app server I can do a container managed tx over a jdbc connection and a neo connection. After all, this is the ultimate purpose of this exercise. I will fill you in as I go through each of the above. Thanks for your time. cheers, CG ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
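As a rough illustration of the failure the proposed test should produce, here is a sketch of the generic JTA two-phase-commit sequence against two enlisted resources (the branch Xids and the way the XAResources are obtained are placeholders; getting hold of them is exactly the unspecified part Johan mentions):

import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

public class TwoPhaseCommitSketch
{
    // branch1/branch2 are assumed to be two branches of the same global transaction,
    // nioneoResource/luceneResource the XAResources of the two Neo4j data sources.
    static void commitFirstThenCrash( Xid branch1, Xid branch2,
            XAResource nioneoResource, XAResource luceneResource ) throws Exception
    {
        nioneoResource.start( branch1, XAResource.TMNOFLAGS );
        luceneResource.start( branch2, XAResource.TMNOFLAGS );
        // ... create a node and index it ...
        nioneoResource.end( branch1, XAResource.TMSUCCESS );
        luceneResource.end( branch2, XAResource.TMSUCCESS );

        nioneoResource.prepare( branch1 );
        luceneResource.prepare( branch2 );

        nioneoResource.commit( branch1, false ); // phase two reaches only the first resource
        // crash here: the second resource is left prepared, and on restart the global TM
        // must resolve it via XAResource.recover(..) as described above
        luceneResource.commit( branch2, false );
    }
}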
Re: [Neo4j] neoviz + kernel error
Hi, If it is a very old store that has been running on pre 1.0 beta releases download Neo4j 1.0 and perform a startup+shutdown. After that you will be able to run 1.1 or the current milestone/snapshot releases. If you have been running in HA mode and switched back to for example the 1.1 release you will also get that exception and downgrading from newer to older release is not automatically supported. Regards, Johan On Tue, Nov 9, 2010 at 1:46 AM, Karen Nomorosa karen.nomor...@reardencommerce.com wrote: Hi all, Tried running the neo-graphviz component and got the following error: org.neo4j.kernel.impl.nioneo.store.IllegalStoreVersionException: Store version [NeoStore v0.9.6]. Please make sure you are not running old Neo4j kernel towards a store that has been created by newer version of Neo4j. at org.neo4j.kernel.impl.nioneo.store.NeoStore.versionFound(NeoStore.java:355)... So I looked at NeoStore.java of the latest kernel source (at /components/kernel/trunk/src/main/java/org/neo4j/kernel/impl/nioneo/store/NeoStore.java) and it seems to only check for v0.9.5, throwing an exception for any other version. Would there be a quick workaround for this? Thanks, Karen Karen Joy Nomorosa Semantic Analyst Project Qhttp://wiki/display/pm/WhatIsProjectQ Tell a little... Get a lot... ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Concurrency issue/exception with Neo 1.1
In 1.1 we did a memory optimization that introduced a bug in locking. This has been fixed in 1.2-SNAPSHOT and latest milestone release. Could you try and run with 1.2.M02 and see if that helps? To diagnose the problem further we would need to dump full lock information. Invoke LockManager.dumpAllLocks() when you catch the DeadlockDetectedException to see full information on all locks currently being held (make sure you invoke dumpAllLocks before you finish the transaction). -Johan On Thu, Nov 4, 2010 at 2:09 PM, rick.bullo...@burningskysoftware.com wrote: Hi, Johan. No, we're just using standard Neo transaction mechanism with no explicit locking. What is occurring is that transaction A attempts to delete all relationships of type R1 from node N1, then it adds relationships of type R1 to node N1, perhaps to the same nodes from which they were deleted, perhaps not. In parallel, a different transaction B attempts something similar on node N1, but it only deletes/adds relationships of type R2. Both blocks of code are initiated by servlet requests and are wrapped in Neo transactions. Thanks in advance for any suggestions to diagnose. Rick Original Message Subject: Re: [Neo4j] Concurrency issue/exception with Neo 1.1 From: Johan Svensson [1]jo...@neotechnology.com Date: Thu, November 04, 2010 6:25 am To: Neo4j user discussions [2]u...@lists.neo4j.org Hi Rick, Are you grabbing read locks manually on the relationship? If the transaction has a read lock on relationship 1795 and wants to upgrade it to a write lock but it has to wait because of other transactions having read locks on that relationship. This could lead to deadlock if one of those other transaction is waiting for the lock on NodeImpl 17. Regards, Johan On Wed, Nov 3, 2010 at 10:38 PM, Rick Bullotta [3]rick.bullo...@burningskysoftware.com wrote: The except message below is really confusing, since it *seems* to indicate that the transaction is deadlocking with itself...anyone able to shed some light on it? -Original Message- From: [4]user-boun...@lists.neo4j.org [[5]mailto:user-boun...@lists.neo4j.org] On Behalf Of Rick Bullotta Sent: Wednesday, November 03, 2010 4:19 PM To: 'Neo4j user discussions' Subject: [Neo4j] Concurrency issue/exception with Neo 1.1 Hi, all. Looking for some guidance on an issue we've encountered. If two threads attempt to delete relationships on the same node (different relationship types) it seems we get into a deadlock situation of some kind, from what we see in the exception below. Any thoughts? We're running 1.1. Thanks, Rick Internal error: org.neo4j.kernel.DeadlockDetectedException: Transaction(108)[STATUS_ACTIVE,Resources=1] can't wait on resource RWLock[RelationshipImpl #1795 of type PropertyReadPermission between Node[17] and Node[26]] since = Transaction(108)[STATUS_ACTIVE,Resources=1] - RWLock[NodeImpl#17] - Transaction(108)[STATUS_ACTIVE,Resources=1] - RWLock[RelationshipImpl #1795 of type PropertyReadPermission between Node[17] and Node[26]] at org.neo4j.kernel.impl.transaction.RagManager.checkWaitOnRecursive(RagMa nager .java:219) at org.neo4j.kernel.impl.transaction.RagManager.checkWaitOnRecursive(RagMa nager .java:247) at org.neo4j.kernel.impl.transaction.RagManager.checkWaitOn(RagManager.jav a:186 ) at org.neo4j.kernel.impl.transaction.RWLock.acquireWriteLock(RWLock.java:3 00) at org.neo4j.kernel.impl.transaction.LockManager.getWriteLock(LockManager. 
java: 129) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.getWriteLock(WriteTran sacti on.java:828) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.disconnectRelationship (Writ eTransaction.java:717) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.relDelete(WriteTransac tion. java:704) at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaConnection$RelationshipEventC onsum erImpl.deleteRelationship(NeoStoreXaConnection.java:262) at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$NioNeoDbResou rceCo nnection.relDelete(NioNeoDbPersistenceSource.java:375) at org.neo4j.kernel.impl.persistence.PersistenceManager.relDelete(Persiste nceMa nager.java:158) at org.neo4j.kernel.impl.core.NodeManager.deleteRelationship(NodeManager.j ava:8 08) at org.neo4j.kernel.impl.core.RelationshipImpl.delete(RelationshipImpl.jav a:164 ) at org.neo4j.kernel.impl.core.RelationshipProxy.delete(RelationshipProxy.j
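(The stack trace above is cut off by the archive.) A sketch of where dumpAllLocks() would be called, following Johan's suggestion to dump before the transaction finishes; the lock manager lookup uses the getConfig() chain shown elsewhere in these threads, and the work inside the transaction is a placeholder:

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.DeadlockDetectedException;
import org.neo4j.kernel.EmbeddedGraphDatabase;
import org.neo4j.kernel.impl.transaction.LockManager;

public class DumpLocksOnDeadlock
{
    static void deleteAndRecreate( GraphDatabaseService gdb )
    {
        LockManager lockManager = ((EmbeddedGraphDatabase) gdb).getConfig().getLockManager();
        Transaction tx = gdb.beginTx();
        try
        {
            // ... delete and re-add the relationships of one type on the node ...
            tx.success();
        }
        catch ( DeadlockDetectedException e )
        {
            // dump while the transaction still holds its locks, i.e. before finish()
            lockManager.dumpAllLocks();
            tx.failure();
        }
        finally
        {
            tx.finish();
        }
    }
}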
Re: [Neo4j] Concurrency issue/exception with Neo 1.1
The lock dump is not consistent with the cycle detected that causes the exception. Are there other transactions running concurrently that had a chance to finish between DDE being thrown and dumpAllLocks being invoked? Since it is repeatable maybe you can send a test case that triggers the problem? On Thu, Nov 4, 2010 at 4:57 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote: One other thing to consider: In this scenario, relationship type X between Node A and Node B might be deleted and then re-created in the same transaction, if that matters at all. -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Rick Bullotta Sent: Thursday, November 04, 2010 11:55 AM To: 'Neo4j user discussions' Subject: Re: [Neo4j] Concurrency issue/exception with Neo 1.1 Still have the problem in 1.2.M02 (very repeatable, which is a good thing). Here is the exception + lock dump information you requested: Transaction(36)[STATUS_ACTIVE,Resources=1] can't wait on resource RWLock[Relationship[1795]] since = Transaction(36)[STATUS_ACTIVE,Resources=1] - RWLock[Node[17]] - Transaction(36)[STATUS_ACTIVE,Resources=1] - RWLock[Relationship[1795]] Total lock count: readCount=0 writeCount=2 for Lockable relationship #1582 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,2w) Total lock count: readCount=0 writeCount=2 for Lockable relationship #1581 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,2w) Total lock count: readCount=0 writeCount=1 for Lockable relationship #1579 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,1w) Total lock count: readCount=0 writeCount=3 for Lockable relationship #1600 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,3w) Total lock count: readCount=0 writeCount=4 for Lockable relationship #1601 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,4w) Total lock count: readCount=0 writeCount=2 for Lockable relationship #1602 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,2w) Total lock count: readCount=0 writeCount=4 for Lockable relationship #1603 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,4w) Total lock count: readCount=0 writeCount=1 for Relationship[1599] Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,1w) Total lock count: readCount=0 writeCount=6 for Node[17] Waiting list: [Thread[http-8080-5,5,main](0r,0w),WRITE] Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,6w) Total lock count: readCount=0 writeCount=2 for Lockable relationship #1598 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,2w) Total lock count: readCount=0 writeCount=3 for Lockable relationship #1238 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,3w) Total lock count: readCount=0 writeCount=2 for Node[25] Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,2w) Total lock count: readCount=0 writeCount=1 for Node[26] Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,1w) Total lock count: readCount=0 writeCount=1 for Lockable relationship #1621 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,1w) 
Total lock count: readCount=0 writeCount=3 for Node[29] Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,3w) Total lock count: readCount=0 writeCount=1 for Lockable relationship #1620 Waiting list: Locking transactions: Transaction(36)[STATUS_MARKED_ROLLBACK,Resources=1](0r,1w) Total lock count: readCount=0 writeCount=1 for Relationship[1795] Waiting list: Locking transactions: Transaction(37)[STATUS_ACTIVE,Resources=0](0r,1w) There are no empty locks -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: Thursday, November 04, 2010 10:40 AM To: Neo4j user discussions Subject: Re: [Neo4j] Concurrency issue/exception with Neo 1.1 In 1.1 we did a memory optimization that introduced a bug in locking. This has been fixed in 1.2-SNAPSHOT and latest milestone release. Could you try and run with 1.2.M02 and see if that helps? To diagnose the problem further we would need to dump full lock information. Invoke LockManager.dumpAllLocks() when you catch the DeadlockDetectedException to see full information on all locks currently being held (make sure you invoke dumpAllLocks before you finish the transaction). -Johan On Thu, Nov 4, 2010 at 2:09 PM, rick.bullo...@burningskysoftware.com wrote
Re: [Neo4j] Upgrading store from 1.0-1.2.M02
Hi, Upgrading to newer version will work unless specified otherwise in release notes. Downgrading is not supported out of the box. So in this case upgrading from 1.0 to 1.1 or 1.2.M02 will work while downgrading from 1.2.M02 to 1.1 or 1.0 will not work. Make sure you have a clean shutdown before upgrading. Trying to upgrade from 1.0 or 1.1 to 1.2.M02 from a non clean shutdown will not work since log format has changed between 1.1 and 1.2. Regards, Johan On Sun, Oct 24, 2010 at 11:13 AM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Chris, that should be no problem, Johan knows more. However, do a backup of your existing store first and then upgrade to 1.2.M02 and check that things work. Is that ok for you? Cheers, /peter neubauer On Sat, Oct 23, 2010 at 6:06 PM, Chris Diehl di...@alumni.cmu.edu wrote: Hi Peter, I'm currently using Neo4j 1.0. Chris - Chris, normally there is no problem, but of course it depends on how old your store is. What Neo4j version are you using right now? Cheers, /peter neubauer On Fri, Oct 22, 2010 at 7:29 PM, Chris Diehl cpdi...@gmail.com wrote: Hi All, When shifting to a new release of neo4j, is there anything that needs to be done to migrate current databases created under the previous version? TIA, Chris ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
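A minimal sketch of the "clean shutdown before upgrading" step: open the store once with the old kernel on the classpath so the logical logs are applied and the store is left cleanly shut down, then switch the jars to the newer release (the store path is a placeholder):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class CleanShutdownBeforeUpgrade
{
    public static void main( String[] args )
    {
        // run this with the *old* kernel (e.g. 1.0 or 1.1) on the classpath
        GraphDatabaseService db = new EmbeddedGraphDatabase( "/path/to/store" );
        db.shutdown();
    }
}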
Re: [Neo4j] file format and persistence
Hi, The documentation is mostly in the source code so I would suggest you have a look at the org.neo4j.kernel.impl.nioneo.store package at first. There are some information about this in the user archives (for example http://www.mail-archive.com/user@lists.neo4j.org/msg01042.html). I would also recommend reading this article http://arxiv.org/abs/1004.1001v1 by Marko A. Rodriguez and Peter Neubauer. Regards, Johan On Mon, Sep 20, 2010 at 12:41 PM, Claudio Martella claudio.marte...@tis.bz.it wrote: Hi, is there any documentation explaining how the graph is serialized to disk to achieve persistence? I'm trying to understand how data is saved and how efficient it is. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] PatternNode and PatternMatcher
The pattern matcher requires a starting node to start the search from. If the pattern you are trying to match is "find all persons who are married and live together" you could do something like this: PatternNode person1 = new PatternNode(); PatternNode person2 = new PatternNode(); PatternNode address = new PatternNode(); person1.createRelationshipTo( person2, SPOUSE_OF, Direction.BOTH ); person1.createRelationshipTo( address, ADDRESS_AT ); person2.createRelationshipTo( address, ADDRESS_AT ); // match over all potential nodes for ( Node personNode : getAllPersons() ) { Iterable<PatternMatch> matches = PatternMatcher.getMatcher().match( person1, personNode ); for ( PatternMatch match : matches ) { livesTogether( match.getNodeFor( person1 ), match.getNodeFor( person2 ) ); } } It may be smarter to go over all addresses instead if there are fewer addresses than persons: // match over addresses instead for ( Node addressNode : getAllAddresses() ) { Iterable<PatternMatch> matches = PatternMatcher.getMatcher().match( address, addressNode ); for ( PatternMatch match : matches ) { livesTogether( match.getNodeFor( person1 ), match.getNodeFor( person2 ) ); } } You can also make the pattern finder do the getAllPersons() / getAllAddresses() for you if all persons and addresses are connected to a sub-reference node (in this example via a relationship of type IS_A). PatternNode personRoot = new PatternNode(); PatternNode addressRoot = new PatternNode(); person1.createRelationshipTo( personRoot, IS_A ); address.createRelationshipTo( addressRoot, IS_A ); // + the same pattern as above // start match from person sub ref node Iterable<PatternMatch> matches = PatternMatcher.getMatcher().match( personRoot, personSubRefNode ); for ( PatternMatch match : matches ) ... // start match from address sub ref node Iterable<PatternMatch> matches = PatternMatcher.getMatcher().match( addressRoot, addressSubRefNode ); for ( PatternMatch match : matches ) ... Regards, Johan On Thu, Sep 16, 2010 at 8:39 PM, Joshi Hemant - hjoshi hemant.jo...@acxiom.com wrote: I followed the examples at http://components.neo4j.org/neo4j-graph-matching/ for a graph matching problem I am working on. I have a graph where node A <--SPOUSE_OF--> node B Node A <--ADDRESS_AT--> node C Node B <--ADDRESS_AT--> node C node D <--SPOUSE_OF--> node E Node D <--ADDRESS_AT--> node F Node E <--ADDRESS_AT--> node F I am looking for such a pattern in my graph with the following code. I thought creating pattern nodes for address and spouse would be sufficient to identify all nodes like node A. Given node A as input to this function, I expect to find other nodes that have a similar relationship pattern. The code returns no matches even though I should have found node D as a pattern match. What am I missing here? Thanks in advance.
- public static Iterable<Node> findNodesWithRelationshipsTo(Node node ) { final PatternNode requested = new PatternNode(); if(node.hasRelationship(MyRelationshipTypes.ADDRESS_AT)){ for(Relationship rAddress : node.getRelationships(MyRelationshipTypes.ADDRESS_AT)){ PatternNode address = new PatternNode(); requested.createRelationshipTo(address); } } if(node.hasRelationship(MyRelationshipTypes.SPOUSE_OF)){ for(Relationship rSpouse : node.getRelationships(MyRelationshipTypes.SPOUSE_OF)){ PatternNode spouseNode = new PatternNode(); requested.createRelationshipTo(spouseNode); } } PatternMatcher matcher = PatternMatcher.getMatcher(); Iterable<PatternMatch> matches = matcher.match( requested, node ); return new IterableWrapper<Node, PatternMatch>( matches ) { @Override protected Node underlyingObjectToObject( PatternMatch match ) { return match.getNodeFor( requested ); } }; } *** The information contained in this communication is confidential, is intended only for the use of the recipient named above, and may be legally privileged. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please resend this communication to the sender and delete the original message or any copy of it from your computer system. Thank You. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] About transactions and locking
Hi, On Thu, Aug 26, 2010 at 11:26 AM, Pierre Fouche pr.fou...@gmail.com wrote: Hi, I have a few questions about transactions and locking in Neo4j. When I read the 'Isolation' section of the transaction wiki page (http://wiki.neo4j.org/content/Transactions), I understand that Neo4j provides a single isolation level equivalent to TRANSACTION_READ_COMMITTED in SQL92 terminology (that is, no dirty reads but no repeatable reads). Is this a fair statement? Neo4j does not provide any optimistic locking mechanism out-of-the-box. Is this correct? Yes, this is correct. The wiki page on transactions (http://wiki.neo4j.org/content/Transactions) refers to the LockManager class (the API link is broken btw). How can I get a reference to a LockManager instance? The only way I found was to call EmbeddedGraphDatabase.getConfig().getLockManager(). However the javadoc for getConfig() states that it "will most likely be removed in future releases". True, it is not decided how to expose the locking mechanism in the API yet. For now use getConfig().getLockManager(). If you only need to grab a write lock on a node or relationship you can invoke the removeProperty method with a property key that does not exist. Suppose I opened two Neo4j databases DB1 and DB2. Can I create two transactions (one on DB1, the other on DB2) in the same thread? Yes, but you will have to manage two different transactions like this: GraphDatabaseService db1 = new EmbeddedGraphDatabase( "db1" ); GraphDatabaseService db2 = new EmbeddedGraphDatabase( "db2" ); Transaction tx1 = db1.beginTx(); Transaction tx2 = db2.beginTx(); db1.createNode(); db2.createNode(); tx1.success(); tx2.success(); tx1.finish(); tx2.finish(); Regards, Johan ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
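To make the write-lock trick Johan mentions concrete, a small sketch; the property key "__lock__" is just a placeholder assumed not to exist on the entity:

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;

public class WriteLockTrick
{
    // Grabs a write lock on the node for the remainder of the transaction by
    // removing a property key that is assumed not to exist.
    static void lockAndWork( GraphDatabaseService db, Node node )
    {
        Transaction tx = db.beginTx();
        try
        {
            node.removeProperty( "__lock__" ); // acquires the write lock
            // ... do the work that must be protected by the lock ...
            tx.success();
        }
        finally
        {
            tx.finish(); // the lock is released when the transaction finishes
        }
    }
}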
Re: [Neo4j] java.io.IOException when starting REST server on Ubuntu 10.04 32 bit
Todd, Size of the memory mapped log buffer is final and size is set to 1024 * 1024 * 2 bytes (see MemoryMappedLogBuffer). What JVM version are you running? -Johan On Tue, Aug 24, 2010 at 11:23 AM, David Montag david.mon...@neotechnology.com wrote: Hi Todd, We would really appreciate it if you could file a ticket on https://trac.neo4j.org with all the info you can provide. Also, if you have the time, you are definitely encouraged to have a look at the source and submit a suggested patch. (see http://wiki.neo4j.org/content/About_Contributor_License_Agreement for more info) We are super busy right now, but we have not lost track of this. My suggestion is to take it as far as you can on your own first, put all the info in the ticket, and then we can look at it. Thank you. David On Tue, Aug 24, 2010 at 11:08 AM, Todd Chaffee t...@mikamai.com wrote: Any chance of getting some pointers on how to deal with this? On Mon, Aug 23, 2010 at 11:52 AM, Todd Chaffee t...@mikamai.com wrote: I'm getting an error when starting up the REST server on an Ubuntu 10.04 32bit box. Output of uname -a Linux ubuntu-server-base-v01 2.6.32-24-generic #39-Ubuntu SMP Wed Jul 28 06:07:29 UTC 2010 i686 GNU/Linux I'm using the maven start script to run the REST server and here's the error I get: java.io.IOException: Invalid argument at sun.nio.ch.FileChannelImpl.map0(Native Method) at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:789) at org.neo4j.kernel.impl.transaction.xaframework.MemoryMappedLogBuffer.getNewMappedBuffer(MemoryMappedLogBuffer.java:77) at org.neo4j.kernel.impl.transaction.xaframework.MemoryMappedLogBuffer.init(MemoryMappedLogBuffer.java:46) at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.open(XaLogicalLog.java:238) at org.neo4j.kernel.impl.transaction.xaframework.XaContainer.openLogicalLog(XaContainer.java:90) at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.init(NeoStoreXaDataSource.java:131) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:532) at org.neo4j.kernel.impl.transaction.XaDataSourceManager.create(XaDataSourceManager.java:72) at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:147) at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:134) at org.neo4j.kernel.EmbeddedGraphDbImpl.init(EmbeddedGraphDbImpl.java:98) at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:79) at org.neo4j.rest.domain.DatabaseLocator.getGraphDatabase(DatabaseLocator.java:31) at org.neo4j.rest.domain.DatabaseLocator.getConfiguration(DatabaseLocator.java:44) at org.neo4j.rest.GrizzlyBasedWebServer.init(GrizzlyBasedWebServer.java:26) at org.neo4j.rest.GrizzlyBasedWebServer.clinit(GrizzlyBasedWebServer.java:17) at org.neo4j.rest.WebServerFactory.getDefaultWebServer(WebServerFactory.java:9) at org.neo4j.rest.Main.main(Main.java:16) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:291) at java.lang.Thread.run(Thread.java:636) According to Sun/Oracle, the transfer length 
is too large for the OS. http://forums.sun.com/thread.jspa?threadID=5205184 For reference, it looks like the sizes are declared in GraphDbInstance on line 62, method getDefaultParms. Is there any way I can override these sizes from the command line when starting up the REST server or does this need to be changed in the source code? Todd ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Changes not pushed to slave as expected
For the online-backup based HA the master pushes data to the slaves (and slaves can not accept writes). In the new version of HA (not generally available yet) the slaves poll the master for updates (and can accept writes). Having the master push updates to slaves may be implemented in a later release. Regards, Johan On Thu, Aug 5, 2010 at 6:27 PM, suryadev vasudev suryadev.vasu...@gmail.com wrote: Neo team, Which is correct? The slaves periodically ping the master and sync the changes. Or the Master pushes the data to slaves. Now that Neo is entering the world of high availability, can you please promote Master-Follower for identifying the relation? The non-master instances are not slaves really. They are equal to master and just out of courtesy have agreed to co-operate with a peer server. On Thu, Aug 5, 2010 at 7:28 AM, George Ciubotaru george.ciubot...@weedle.com wrote: Hi Peter, I'm working on using Neo4j with .NET technologies (through a REST service), not too much experience with Java. So, unfortunately, I won't be able to debug the code that easily (need Java tools for this that I'm not very familiar with). Thanks, George -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Peter Neubauer Sent: 05 August 2010 14:51 To: Neo4j user discussions Subject: Re: [Neo4j] Changes not pushed to slave as expected George, not really sure what is happening, seems the log rotate is not triggering the push? Could you mount and debug the source code? Otherwise, I can see if I can recreate it tomorrow ... Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Wed, Aug 4, 2010 at 5:00 PM, George Ciubotaru george.ciubot...@weedle.com wrote: Hello, In a master-slave environment I'm expecting that the changes done on the master to be pushed to the server when the logical logs are rotated ( http://wiki.neo4j.org/content/Online_Backup_HA). I'm setting the logical log size (dataSource.setLogicalLogTargetSize(...)) to a small value to easily see the push but when the logs are rotated (I can see in the master store directory that a new nioneo_logical.log.v* is created) nothing is send to the slave. Everything works as expected when I manually call master.rotateLogAndPushToSlaves(), but I would like to avoid doing this manual job. On master side keep_logical_logs auto_rotate are set to true. I'm using kernel version 1.1 and backup version 0.6. Can anybody help me with this? Thank you in advance, George ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] BatchInserter performance
Inject the data from scratch in a new db and make sure not to break the contract of the batch inserter meaning: 1) must be used in a single threaded environment 2) shutdown invoked successfully Regards, Johan On Mon, Jul 26, 2010 at 7:15 AM, Mohit Vazirani mohi...@yahoo.com wrote: Any suggestions on how I can fix these errors and successfully create relations between the nodes using BatchInserter? - Original Message From: Mohit Vazirani mohi...@yahoo.com To: Neo4j user discussions user@lists.neo4j.org Sent: Tue, July 20, 2010 8:36:58 PM Subject: Re: [Neo4j] BatchInserter performance The exception was: org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Record[39983] not in use at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getRecord(RelationshipStore.java:190) at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getRecord(RelationshipStore.java:93) at org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.connectRelationship(BatchInserterImpl.java:187) at org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.createRelationship(BatchInserterImpl.java:172) at RelationshipInserter.main(RelationshipInserter.java:76) I checked my code and it did a clean shutdown as far as I can tell. I also called index.optimize() immediately after instantiating it to make sure. - Original Message From: Johan Svensson jo...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Mon, July 19, 2010 1:57:13 AM Subject: Re: [Neo4j] BatchInserter performance Could you provide a stacktrace for the error? Using the batch inserter and failing to invoke shutdown could be the cause of this problem. Regards, Johan On Sat, Jul 17, 2010 at 3:48 AM, Mohit Vazirani mohi...@yahoo.com wrote: That seemed to help get past that step. However, I am now seeing different error(s) when I try to create a relationship between two nodes Position[39993] requested for operation is high id[39992] or store is flagged as dirty[true] org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Position[39993] requested for operation is high id[39992] or store is flagged as dirty[true] or Record[39983] not in use The node ids being linked for the first error were 12462 and 2369702. I verified their existence by connecting through the web interface and the shell. ... ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
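A minimal sketch of honoring that contract, i.e. one thread only and shutdown always invoked (the store path and property are made up):

import java.util.HashMap;
import java.util.Map;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class BatchInsertContract
{
    public static void main( String[] args )
    {
        // only this one thread ever touches the inserter
        BatchInserter inserter = new BatchInserterImpl( "target/batch-db" );
        try
        {
            Map<String,Object> props = new HashMap<String,Object>();
            props.put( "name", "example" ); // made-up property
            long nodeId = inserter.createNode( props );
            // ... more createNode()/createRelationship() calls ...
        }
        finally
        {
            // shutdown must run to completion or the store will need recovery
            inserter.shutdown();
        }
    }
}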
Re: [Neo4j] Error in BatchInserterService
Hi, Thanks for reporting this. The elementCleaned() method must be called for each element in the cache on shutdown, so this is a bug. Regarding performance, the GraphDatabaseService wrapper around the batch inserter is there for convenience and will not perform as well as the normal batch inserter API. There may be a full implementation in the future but right now it only supports basic insertion and lookup. Regards, Johan On Thu, Jul 22, 2010 at 5:34 PM, Craig Taverner cr...@amanzi.com wrote: When I looked at BatchGraphDatabaseImpl, the impression I got was that the work to fully support the GraphDatabaseService was only partially completed. It seems it is necessary to use the BatchInserter API to get things working correctly, and if you use the GraphDatabaseService wrapper, some things silently fail. I would, however, think it should be possible to complete this implementation. Perhaps the fake transaction provided by the BatchGraphDatabaseImpl.beginTx() should be able to call the elementCleaned() method when tx.finish() is called, and flush the properties to disk? In my opinion, using the GraphDatabaseService wrapper on the BatchInserter should merely perform worse than using the real BatchInserter. I do not think it should fail to perform some key functions at all. Any opinions on this from the core team? On Thu, Jul 22, 2010 at 12:47 PM, Lagutko, Nikolay nikolay.lagu...@gersis-software.com wrote: Hi to all, I found an interesting thing in BatchGraphDatabaseImpl. I tried to load a lot of data using the BatchInserter service and everything was OK. But some nodes that were created at the end didn't have any properties. So I looked at the code and found the following: properties are written to the database only when the LruCache.elementCleaned() method is called. And when we call shutdown() on the service it calls the clear() method of LruCache. So let's have a look at this method: public synchronized void clear() { resizeInternal( 0 ); } private void resizeInternal( int newMaxSize ) { resizing = true; try { if ( newMaxSize >= size() ) { maxSize = newMaxSize; } else if ( newMaxSize == 0 ) { cache.clear(); } else { maxSize = newMaxSize; java.util.Iterator<Map.Entry<K,E>> itr = cache.entrySet().iterator(); while ( itr.hasNext() && cache.size() > maxSize ) { E element = itr.next().getValue(); itr.remove(); elementCleaned( element ); } } } finally { resizing = false; } } As you can see, if we call clear() we don't write the last changes to the database and only clear the cache. Is that the correct way? Nikolay Lagutko ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
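A sketch of one possible fix along the lines discussed (this is only a fragment of the resizeInternal() method quoted above, and not necessarily the fix that was actually committed): run every cached element through elementCleaned() before dropping it in the newMaxSize == 0 branch, so pending property writes reach the store on shutdown.

else if ( newMaxSize == 0 )
{
    for ( E element : cache.values() )
    {
        // give subclasses a chance to flush pending writes before eviction
        elementCleaned( element );
    }
    cache.clear();
}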
Re: [Neo4j] Neo4j Multiple Nodes Issue
Hi, One can use the built in locking in the kernel to synchronize and make code thread safe. Here is an example of this: https://svn.neo4j.org/examples/apoc-examples/trunk/src/main/java/org/neo4j/examples/socnet/PersonFactory.java The createPerson method guards against creation of multiple persons with the same name by creating a relationship from the reference node. After the relationship has been created (in the transaction but not yet committed) the write lock for the reference node has been acquired making sure any other running transaction has to wait for the lock to be released. Finally the index is checked to make sure some other transaction did not create the person while the current transaction was waiting for the write lock. Even simpler is to just remove a non existing property from a node or relationship. That will grab a lock on the specific node or relationship (that will be held until the transaction commits or is rolledback). Regards, Johan On Wed, Jul 21, 2010 at 4:07 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote: The node id indirectly achieves this, but node id's can be recycled when nodes are deleted. Also, depending on node id may or may not work in future versions of Neo that might support sharding or distributed storage. Sounds to me like you have a more simple issue in that your UID generator isn't coded properly. It should be designed as thread safe so that you couldn't get the same UID in the first place. -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Maaz Bin Tariq Sent: Wednesday, July 21, 2010 9:39 AM To: user@lists.neo4j.org Subject: [Neo4j] Neo4j Multiple Nodes Issue Hi, We are facing kind of a situation in our application using neo4j graph and we want to have your advice regarding this issue. We use graph in a way that we create a node and with every node we set a property i-e UID =value(numeric) and then we create the relationships between those nodes. According to our use case requirement there needs to be only one node in the whole graph space should exist having a UID value. That is UID should become the unique identifier for a node in the graph space. Our graph service is configured using the spring framework and all transaction handling is being managed by the spring itself. Now we are facing the problem where multiple nodes get created having the same UID because of multiple transactions running the same time and one transaction effect is not visible to other until one is committed. What we do is that we look for a node in the graph with a specific UID and if it is not there we create one. So in that case there is probability where multiple nodes could be created having the same UID if multiple transactions running same time and trying to lookup create same UID. Here I want to inquire that do we have in neo4j some kind of unique constraint be applied on a specific property that prevent multiple nodes get created having the same UID. Second, Let say if I am creating 1000 nodes and their relationships in one transaction and now I want to know that what is the performance cost if I create each node and its relationship in one separate transaction. Thanks-Maaz ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
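For reference, a compact sketch of the pattern Johan describes (grab the write lock by removing a non-existing property, then re-check the index before creating). The class name, property key "UID" and the placeholder lock property are made up, and IndexService here is assumed to be the old org.neo4j.index API used elsewhere on this list:

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;
import org.neo4j.index.IndexService;

public class UniqueNodeFactory
{
    private final GraphDatabaseService graphDb;
    private final IndexService index;

    public UniqueNodeFactory( GraphDatabaseService graphDb, IndexService index )
    {
        this.graphDb = graphDb;
        this.index = index;
    }

    public Node getOrCreateByUid( long uid )
    {
        Transaction tx = graphDb.beginTx();
        try
        {
            // Removing a property that does not exist grabs the write lock on
            // the reference node; other transactions doing the same will wait
            // here until this transaction commits or rolls back.
            graphDb.getReferenceNode().removeProperty( "__lock_placeholder__" );
            // Re-check the index now that we hold the lock, in case another
            // transaction created the node while we were waiting.
            Node existing = index.getSingleNode( "UID", uid );
            if ( existing != null )
            {
                tx.success();
                return existing;
            }
            Node created = graphDb.createNode();
            created.setProperty( "UID", uid );
            index.index( created, "UID", uid );
            tx.success();
            return created;
        }
        finally
        {
            tx.finish();
        }
    }
}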
Re: [Neo4j] BatchInserter performance
Could you provide a stacktrace for the error? Using the batch inserter and failing to invoke shutdown could be the cause of this problem. Regards, Johan On Sat, Jul 17, 2010 at 3:48 AM, Mohit Vazirani mohi...@yahoo.com wrote: That seemed to help get past that step. However, I am now seeing different error(s) when I try to create a relationship between two nodes Position[39993] requested for operation is high id[39992] or store is flagged as dirty[true] org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Position[39993] requested for operation is high id[39992] or store is flagged as dirty[true] or Record[39983] not in use The node ids being linked for the first error were 12462 and 2369702. I verified their existence by connecting through the web interface and the shell. This is part of what my code looks like: BatchInserter inserter = new BatchInserterImpl( graphLocation, BatchInserterImpl.loadProperties( neo4j.props.relationships ) ); LuceneIndexBatchInserter index = new LuceneIndexBatchInserterImpl(inserter); long node1 = index.getNodes(Property1, value1).iterator().next(); long node2 = index.getNodes(Property1, value2).iterator().next(); inserter.createRelationship(node1, node2, DynamicRelationshipType.withName( REL_TYPE ), null ); index.shutdown(); inserter.shutdown(); - Original Message From: Johan Svensson jo...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Fri, July 16, 2010 1:44:52 AM Subject: Re: [Neo4j] BatchInserter performance Hi, You started up the batch inserter on a store that had not been shutdown properly. You could try startup in normal non batch inserter mode and just shutdown: new EmbeddedGraphDatabase( storeDir ).shutdown(); That will do a fast rebuild of the id generators and after that the batch inserter should be able to start quickly without doing a full rebuild of the id generators. Regards, Johan On Fri, Jul 16, 2010 at 2:40 AM, Mohit Vazirani mohi...@yahoo.com wrote: Hi, I have instantiated a neo4j store (on a machine with 32GB RAM) with a root node, a sub-ref node and 400 million nodes connected to the subref node with the same relationship type. Each one of these nodes has 3 properties (int, long, String) all of which were indexed using something similar to the first loop in the following example (http://wiki.neo4j.org/content/Batch_Insert#Using_batch_inserter_together_with_indexing). . Now I have another Java app that I have written to create roughly 1 billion relationships (2nd relationship type) between these 400 million nodes. BatchInserter inserter = new BatchInserterImpl( /graphdb/neo4j-rest-db/, BatchInserterImpl.loadProperties( neo4j.props ) ); The memory mapped neo4j.props file looks as follows: neostore.nodestore.db.mapped_memory=3500M neostore.relationshipstore.db.mapped_memory=15G neostore.propertystore.db.mapped_memory=2G neostore.propertystore.db.index.mapped_memory=10M neostore.propertystore.db.index.keys.mapped_memory=10M neostore.propertystore.db.strings.mapped_memory=1G neostore.propertystore.db.arrays.mapped_memory=0M The program is invoked with the following options: java -d64 -server -Xmx24000m -cp $CLASSPATH RelationshipInserter It seems to be stuck at the BatchInserterImpl() function for over 3 hours now. 
JConsole shows the following stacktrace for main: Name: main State: RUNNABLE Total blocked: 0 Total waited: 0 Stack trace: sun.nio.ch.NativeThread.current(Native Method) sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:45) sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:163) - locked java.lang.obj...@284d0371 org.neo4j.kernel.impl.nioneo.store.AbstractDynamicStore.rebuildIdGenerator(AbstractDynamicStore.java:659) ) org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.makeStoreOk(CommonAbstractStore.java:425) ) org.neo4j.kernel.impl.nioneo.store.PropertyStore.makeStoreOk(PropertyStore.java:452) ) org.neo4j.kernel.impl.nioneo.store.NeoStore.makeStoreOk(NeoStore.java:295) org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.init(BatchInserterImpl.java:105) ) RelationshipInserter.main(RelationshipInserter.java:43) I have been using this program mainly for testing and maybe one in ten attempts will run successfully to completion. Any thoughts on what I am doing wrong here? ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] BatchInserter performance
Hi, You started up the batch inserter on a store that had not been shutdown properly. You could try startup in normal non batch inserter mode and just shutdown: new EmbeddedGraphDatabase( storeDir ).shutdown(); That will do a fast rebuild of the id generators and after that the batch inserter should be able to start quickly without doing a full rebuild of the id generators. Regards, Johan On Fri, Jul 16, 2010 at 2:40 AM, Mohit Vazirani mohi...@yahoo.com wrote: Hi, I have instantiated a neo4j store (on a machine with 32GB RAM) with a root node, a sub-ref node and 400 million nodes connected to the subref node with the same relationship type. Each one of these nodes has 3 properties (int, long, String) all of which were indexed using something similar to the first loop in the following example (http://wiki.neo4j.org/content/Batch_Insert#Using_batch_inserter_together_with_indexing). Now I have another Java app that I have written to create roughly 1 billion relationships (2nd relationship type) between these 400 million nodes. BatchInserter inserter = new BatchInserterImpl( /graphdb/neo4j-rest-db/, BatchInserterImpl.loadProperties( neo4j.props ) ); The memory mapped neo4j.props file looks as follows: neostore.nodestore.db.mapped_memory=3500M neostore.relationshipstore.db.mapped_memory=15G neostore.propertystore.db.mapped_memory=2G neostore.propertystore.db.index.mapped_memory=10M neostore.propertystore.db.index.keys.mapped_memory=10M neostore.propertystore.db.strings.mapped_memory=1G neostore.propertystore.db.arrays.mapped_memory=0M The program is invoked with the following options: java -d64 -server -Xmx24000m -cp $CLASSPATH RelationshipInserter It seems to be stuck at the BatchInserterImpl() function for over 3 hours now. JConsole shows the following stacktrace for main: Name: main State: RUNNABLE Total blocked: 0 Total waited: 0 Stack trace: sun.nio.ch.NativeThread.current(Native Method) sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:45) sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:163) - locked java.lang.obj...@284d0371 org.neo4j.kernel.impl.nioneo.store.AbstractDynamicStore.rebuildIdGenerator(AbstractDynamicStore.java:659) org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.makeStoreOk(CommonAbstractStore.java:425) org.neo4j.kernel.impl.nioneo.store.PropertyStore.makeStoreOk(PropertyStore.java:452) org.neo4j.kernel.impl.nioneo.store.NeoStore.makeStoreOk(NeoStore.java:295) org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.init(BatchInserterImpl.java:105) RelationshipInserter.main(RelationshipInserter.java:43) I have been using this program mainly for testing and maybe one in ten attempts will run successfully to completion. Any thoughts on what I am doing wrong here? ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
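Putting the two steps together as a sketch (the store directory and property file name are the ones from this thread and will differ in other setups):

import org.neo4j.kernel.EmbeddedGraphDatabase;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class RecoverThenBatchInsert
{
    public static void main( String[] args )
    {
        String storeDir = "/graphdb/neo4j-rest-db/";
        // fast rebuild of the id generators after a non-clean shutdown
        new EmbeddedGraphDatabase( storeDir ).shutdown();
        // the batch inserter can now start without a full id generator rebuild
        BatchInserter inserter = new BatchInserterImpl( storeDir,
                BatchInserterImpl.loadProperties( "neo4j.props" ) );
        try
        {
            // ... createRelationship() calls ...
        }
        finally
        {
            inserter.shutdown();
        }
    }
}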
Re: [Neo4j] How to traverse by the number of relationships between nodes?
Hi, I would not recommend using large numbers of different (dynamically created) relationship types. It is better to use well defined relationship types with an additional property on the relationship whenever needed. The limit is actually not 64k but 2^31, but having a large number of relationship types (10k-100k+) will reduce performance and consume a lot of memory. Regards, Johan On Thu, Jul 8, 2010 at 4:13 PM, Max De Marzi Jr. maxdema...@gmail.com wrote: Can somebody verify the max number of relationship types? If it is 64k, is there a way to increase it without significant effort? I believe you can have something like 64k relationship types, so using the relationship type for the route name is possible. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
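A sketch of the recommended shape, i.e. one well-defined type plus a property for the route name (the type name "CONNECTS" and property key "route" are made up):

import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.RelationshipType;

public class RouteRelationships
{
    // a single well-defined type instead of one dynamic type per route name
    private static final RelationshipType CONNECTS =
            DynamicRelationshipType.withName( "CONNECTS" );

    public static Relationship connect( Node from, Node to, String routeName )
    {
        Relationship rel = from.createRelationshipTo( to, CONNECTS );
        rel.setProperty( "route", routeName );
        return rel;
    }
}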
Re: [Neo4j] Node creation limit
The 1.2 release is scheduled to be released in Q4 (most likely in November). Regarding implementations running on large graphs using Neo4j there have been several mentions of that on the list so you could try searching the user archives (http://www.mail-archive.com/user@lists.neo4j.org/). For example: http://lists.neo4j.org/pipermail/user/2010-April/003493.html Regarding sharding, if your domain is shardable you can use the same domain-specific sharding scheme with a graph database as you would using some other solution (RDBMS, document store etc). Traversals over shards would then have to be managed by the application that knows about the domain and sharding scheme in place. -Johan On Wed, Jun 9, 2010 at 3:13 AM, Biren Gandhi biren.gan...@gmail.com wrote: Any timeline guidance on release 1.2? We would like to learn about any implementation supporting the following claim on the main neo4j page. Does anyone know about sharding schemes and how a traverser would work with a distributed graph? - massive scalability. Neo4j can handle graphs of several *billion* nodes/relationships/properties on a single machine and can be sharded to scale out across multiple machines. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Online Backup : wich package?
On Wed, Jun 9, 2010 at 1:37 PM, Batistuta Gabriel batistutagabrielf...@gmail.com wrote: Thanks. If I understand the neo4j tutorial and your explanation, this part of the code is correct: //create the original graph neo = new EmbeddedGraphDatabase(CONSTANTS.GRAPH_PATH); graph = ObjectGraphFactory.instance().get(neo); for (String datasource : new String[]{"nioneodb", "lucene"}) { neo.getConfig().getTxModule().getXaDataSourceManager() .getXaDataSource( datasource ).keepLogicalLogs( true ); } //create the backup graph EmbeddedGraphDatabase neo = GraphJo4neo.getGraphDatabaseService(); EmbeddedGraphDatabase backupGraphDb = new EmbeddedGraphDatabase( CONSTANTS.GRAPH_BACKUP_PATH ); IndexService backupIndexService = new LuceneIndexService( backupGraphDb ); Backup backup = new Neo4jBackup( neo, backupGraphDb, new ArrayList<String>() { { add( "nioneodb" ); add( "lucene" ); } } ); try { backup.enableFileLogger(); backup.setLogLevelDebug(); backup.doBackup(); } catch (IOException e) {} Basically, this part of the code is right, isn't it? Yes that looks right. -Johan ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Online Backup : wich package?
What was the output of System.out.println(System.getProperty("java.class.path")); as Tobias asked you to do? On Wed, Jun 9, 2010 at 1:56 PM, Batistuta Gabriel batistutagabrielf...@gmail.com wrote: However, I obtain this error: java.lang.NoSuchMethodError: org.neo4j.onlinebackup.AbstractResource.<init>(Lorg/neo4j/kernel/impl/transaction/xaframework/XaDataSource;)V at org.neo4j.onlinebackup.EmbeddedGraphDatabaseResource.<init>(EmbeddedGraphDatabaseResource.java:31) at org.neo4j.onlinebackup.Neo4jBackup.doBackup(Neo4jBackup.java:164) at util.BackupNeo4j.run(BackupNeo4j.java:49) ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Node creation limit
I just added code in trunk so the block size for the string and array stores can be configured when the store is created. This will be available in the 1.1 release but if you want to try it out now use 1.1-SNAPSHOT and create a new store like this: Map<String,String> config = new HashMap<String,String>(); config.put( "string_block_size", "60" ); config.put( "array_block_size", "300" ); // create a new store with string block size 60 and array block size 300 new EmbeddedGraphDatabase( "path-to-db-that-does-not-exist", config ).shutdown(); The default value (120 bytes) was picked to fit common/avg size string/array properties in one block since it will be slower to load a property that is spread out over many blocks. Since datasets vary a lot in string size / array size and the like I think it is better to have it configurable at creation time. When tweaking these values remember that strings will consume twice the string length in bytes, so a string block size of 60 will be able to fit a string of length 30 in a single block. Regarding scaling, the 1.0 and 1.1 releases have a limit of 4 billion records per store file, so if you need to store 4 billion strings you have to make sure every string fits in a single block. This limit will be increased to 32 billion or more in the 1.2 release. -Johan On Mon, Jun 7, 2010 at 4:27 PM, Biren Gandhi biren.gan...@gmail.com wrote: Similar issue on my side as well. Test data is ok, but production data (100 million+ objects, 200 relationships per object and 10 properties per object, with multi-million queries per day for search and traversal) would need clear disk sizing calculations due to iops and other hardware limits in a monolithic storage model. Has anyone been able to use neo4j successfully with scaling needs similar to those mentioned above? -b On Jun 7, 2010, at 4:45 AM, Craig Taverner cr...@amanzi.com wrote: Is there a specific constraint on disk space? Normally disk space isn't a problem... it's cheap and there's usually loads of it. Actually for most of my use cases the disk space has been fine. Except for one data source, that surprised me by expanding from less than a gig of original binary data to an over 20GB database. While this too can be managed, it was just a sample, and so I have yet to see what the customer's 'real data' will do to the database (several hundred times larger, I'm expecting). When we get to that point we will need to decide how to deal with it. Currently we 'solve' the issue by allowing the user to filter out data on import, so we don't store everything. This will not satisfy all users, however. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Node creation limit
Hi, These are the current record sizes in bytes that can be used to calculate the actual store size: nodestore: 9 relationshipstore: 33 propertystore: 25 stringstore: 133 arraystore: 133 All properties except strings and arrays will take a single propertystore record (25 bytes). A string or array property will use one record from the propertystore then as many blocks needed from the string/array store file (each block of 133 bytes can store 120 bytes of data). This means if all your strings are in 120 bytes multiples in size you will make very efficient use of the store file while if they are empty you will not make very good use of the space (exactly like a normal filesystem taking up space for empty files). -Johan On Fri, Jun 4, 2010 at 9:15 AM, Mattias Persson matt...@neotechnology.com wrote: That formula is correct regarding nodes and relationships, yes. When properties comes into play another formula would, of course, have to be applied. Depending on property types and length of keys/string values it is different. It could be good though with a formula/tool to calculate that. 2010/6/4, Biren Gandhi biren.gan...@gmail.com: In that case, what are the ways to estimate storage capacity numbers? Basic formula of nodes*9 + edges*33 doesn't seem like a practical one. On Wed, Jun 2, 2010 at 11:26 PM, Mattias Persson matt...@neotechnology.comwrote: String properties are stored in blocks so even if you have tiny string values each property value will occupy a full block (30 or 60 bytes, can someone correct me here?). That's what taking most of your space IMHO ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
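As a back-of-the-envelope illustration using those record sizes (the dataset figures below are made up, not from this thread):

100,000,000 nodes x 9 bytes ≈ 0.9 GB
1,000,000,000 relationships x 33 bytes ≈ 33 GB
500,000,000 property records (all property types, including the string ones) x 25 bytes ≈ 12.5 GB
100,000,000 string values x 133 bytes ≈ 13.3 GB (one block each, assuming every string fits within 120 bytes; longer strings take additional 133-byte blocks)

That gives roughly 60 GB of store files before indexes and logical logs.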
Re: [Neo4j] Tell neo to not reuse ID's
Hi, Maybe we should add a configuration option so that ids are not reused. Martin, you could try patching the code in org.neo4j.kernel.impl.nioneo.store.IdGeneratorImpl: === --- IdGeneratorImpl.java (revision 4480) +++ IdGeneratorImpl.java (working copy) @@ -145,16 +145,6 @@ { throw new IllegalStateException( "Closed id generator " + fileName ); } -if ( defragedIdList.size() > 0 ) -{ -long id = defragedIdList.removeFirst(); -if ( haveMore && defragedIdList.size() == 0 ) -{ -readIdBatch(); -} -defraggedIdCount--; -return id; -} That is in the nextId() method, line 148; just remove or comment out the if clause that grabs an id from the list of reused ids. Some unit tests will fail but that change will cause ids not to be reused. -Johan On Thu, Jun 3, 2010 at 1:05 PM, rick.bullo...@burningskysoftware.com wrote: Hi, Craig. Not crazy at all. We're doing something similar to flag archived nodes. Instead of deleting content (since we still want to access the nodes/relationships infrequently, but only under certain types of queries/traversals), we mark the nodes/relationships with a boolean property. Not as space efficient, but in our case, since we need to retain the full state, it works fine. Original Message Subject: Re: [Neo4j] Tell neo to not reuse ID's From: Craig Taverner cr...@amanzi.com Date: Wed, June 02, 2010 7:10 pm To: Neo4j user discussions u...@lists.neo4j.org Here is a crazy idea that probably only works for nodes. Don't actually delete the nodes, just the relationships and the node properties. The skeleton node will retain the id in the table preventing re-use. If these orphans are not relevant to your tests, this should have the effect you are looking for. On Wed, Jun 2, 2010 at 8:17 PM, Martin Neumann m.neumann.1...@gmail.com wrote: Hej, Is it somehow possible to tell Neo4j not to reuse id's at all? I'm running some experiments on Neo4j and I want to add and delete nodes and relationships. To make sure that I can repeat the same experiment I create a log containing the ID's of the nodes I want to delete. To make sure that I can rerun the experiment each node I add has to have the same ID in each experiment. If ID's can be reused that is not always the case, and that's why I need to turn it off or work around it. Hope for your help. Cheers, Martin ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Urgent: Block Not In Use error
Rick, There is no ordinary way to NOT run recovery on startup if the system crashes. The only way for that to happen is if something extraneous to Neo4j has modified the filesystem in between runs. For example if the logical files are removed after a crash, then starting up could lead to no recovery followed by block not in use behavior you describe. Another example is if you run a filesystem that doesn't honor the POSIX fdatasync() contract (most popular filesystems can be configured to do so...). Could you explain a bit more what happened and some info of your configuration such as: o accidental shutdown means kill/powerfailure/non clean shutdown/clean shutdown etc? o what filesystem and configuration of it -Johan On Thu, Jun 3, 2010 at 5:38 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote: We had an accidental shutdown of a running Neo instance, and there was no automatic recovery on startup. We are getting a bunch of Block Not In Use errors such as: Block not inUse[0] blockId[179414] Is there a way to recover from this? Is this a bug? If so, is there a fix available? Thanks, Rick ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Urgent: Block Not In Use error
No we do nothing like that so I do not think that is the problem. I will have a look at your store to see if I can get any clues to what the problem might be. -Johan On Thu, Jun 3, 2010 at 7:23 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote: One possible hint? http://social.technet.microsoft.com/Forums/fi-FI/w7itprogeneral/thread/df935 a52-a0a9-4f67-ac82-bc39e0585148 -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: Thursday, June 03, 2010 1:11 PM To: Neo4j user discussions Subject: Re: [Neo4j] Urgent: Block Not In Use error That setup should not be a problem. Anything else you can think of that was out of the ordinary before the task got terminated or after (stacktraces, disk full, concurrent process trying to access the same store files etc)? You can contact me off-list if it would be possible for me to have a look at the store. -Johan On Thu, Jun 3, 2010 at 6:25 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote: Hi, Johan. I might have missed the recovery attempt being logged, but here's the basics: - it was a non-clean shutdown but not a powerdown (task terminated). - the nodes that seem to be exhibiting the odd behavior were not being written to at the time, but that doesn't necessarily mean anything - the OS is Windows 7/64-bit - the storage is a Samsung SSD, NTFS, no compression Hope that helps, Rick ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] [Neo] TransactionEventHandler and Spring transaction handling
Antonis, Just committed some bug fixes in the event framework and hopefully this also solves the problem you experienced when using Spring. Could you please try the latest neo4j-kernel 1.1-SNAPSHOT to see if it works? To answer your other question the handler is called in the same thread and you can access node properties in the afterCommit() call (we changed so reads without a running transaction are possible). Regards, Johan On Thu, May 20, 2010 at 2:56 PM, Antonis Lempesis ant...@di.uoa.gr wrote: To further clarify, I run 2 tests. In the first test, my objects were configured using spring + I had the @Transactional annotation in the test method. In the second test, I configured the same objects manually and also started and commited the transaction before and after calling the test method. In both cases, my handler got a TransactionData object (not null), but in the second case tData.assignedNodeProperties().hasNext() returned true while in the first it returned false. thanks for your support, Antonis PS 2 questions: is the handler called in a different thread? And, in afterCommit() method, can I access the node properties in the TransactionData object? Since the transaction is commited (I guess finished), shouldn't I get an NotInTransaction exception? On 5/20/10 3:38 PM, Johan Svensson wrote: Hi, I have not tried to reproduce this but just looking at the code I think it is a bug so thanks for reporting it! The synchronization hook that gathers the transaction data gets registered in the call to GraphDatabaseService.beginTx() but when using Spring (with that configuration) UserTransaction (old JTA) will be called directly so no events will be collected. Will fix this very soon. -Johan On Wed, May 19, 2010 at 5:49 PM, Antonis Lempesisant...@di.uoa.gr wrote: Hello all, I have set up spring to handle transactions for neo4j (according to the imdb example) and it works fine. When I read about the new events framework, I checked out the latest revision (4421) and tried to register my TransactionEventHandler that simply prints the number of created nodes. The weird thing is that when I test this in a simple junit test case, the TransactionData I get contains the correct data. When I do the same thing using the spring configuration, the TransactionData is empty. Any ideas? Thanks, Antonis ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
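For reference, a minimal handler of the kind being tested here, counting created nodes in beforeCommit (a sketch; the class name is made up and the event API is assumed to be the 1.1-SNAPSHOT org.neo4j.graphdb.event package discussed in this thread):

import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.event.TransactionData;
import org.neo4j.graphdb.event.TransactionEventHandler;

public class CreatedNodeCounter implements TransactionEventHandler<Void>
{
    public Void beforeCommit( TransactionData data ) throws Exception
    {
        int created = 0;
        for ( Node node : data.createdNodes() )
        {
            created++;
        }
        System.out.println( "nodes created in this tx: " + created );
        return null;
    }

    public void afterCommit( TransactionData data, Void state )
    {
    }

    public void afterRollback( TransactionData data, Void state )
    {
    }
}

It would be registered with graphDb.registerTransactionEventHandler( new CreatedNodeCounter() ) on the same GraphDatabaseService instance that runs the transactions.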
Re: [Neo4j] Compacting files?
Alex, You are correct about the holes in the store file and I would suggest you export the data and then re-import it again. Neo4j is not optimized for the use case were more data is removed than added over time. It would be possible to write a compacting utility but since this is not a very common use case I think it is better to put that time into producing a generic export/import dump utility. The plan is to get a export/import utility in place as soon as possible so any input on how that should work, what format to use etc. would be great. -Johan On Wed, Jun 2, 2010 at 9:23 AM, Alex Averbuch alex.averb...@gmail.com wrote: Hey, Is there a way to compact the data stores (relationships, nodes, properties) in Neo4j? I don't mind if its a manual operation. I have some datasets that have had a lot of relationships removed from them but the file is still the same size, so I'm guessing there are a lot of holes in this file at the moment. Would this be hurting lookup performance? Cheers, Alex ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
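Until such a utility exists, a naive sketch of the export/re-import idea: copy nodes, properties and outgoing relationships into a fresh store with the batch inserter. This is only an illustration; it ignores indexes and the reference node, keeps the whole id mapping in memory (so it only fits modest graphs), and the paths are made up:

import java.util.HashMap;
import java.util.Map;
import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.PropertyContainer;
import org.neo4j.graphdb.Relationship;
import org.neo4j.kernel.EmbeddedGraphDatabase;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class CompactByCopy
{
    public static void main( String[] args )
    {
        GraphDatabaseService source = new EmbeddedGraphDatabase( "old-db" );
        BatchInserter target = new BatchInserterImpl( "new-db" );
        Map<Long,Long> idMap = new HashMap<Long,Long>();
        try
        {
            // first pass: copy nodes and their properties
            for ( Node node : source.getAllNodes() )
            {
                idMap.put( node.getId(), target.createNode( properties( node ) ) );
            }
            // second pass: copy relationships (outgoing only, so each is copied once)
            for ( Node node : source.getAllNodes() )
            {
                for ( Relationship rel : node.getRelationships( Direction.OUTGOING ) )
                {
                    target.createRelationship( idMap.get( rel.getStartNode().getId() ),
                            idMap.get( rel.getEndNode().getId() ), rel.getType(),
                            properties( rel ) );
                }
            }
        }
        finally
        {
            target.shutdown();
            source.shutdown();
        }
    }

    private static Map<String,Object> properties( PropertyContainer entity )
    {
        Map<String,Object> props = new HashMap<String,Object>();
        for ( String key : entity.getPropertyKeys() )
        {
            props.put( key, entity.getProperty( key ) );
        }
        return props;
    }
}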
Re: [Neo4j] Performance of shortest path algorithms
Hi Tobias, The problem here is that the machine has too little RAM to handle 244M relationships without reading from disk. What type of hard disk are you using? The low CPU usage and continuous reads from disk indicate that cache misses are too high, resulting in many random reads from disk. I would suggest first making sure you have a good configuration, with about 2-3GB memory mapped for the relationship store, and then only running a 512M-1G java heap, but this will probably not be enough. Instead you either have to get more RAM (8GB-12GB) or buy a better disk (a good SSD will increase random read performance by 50-100x). Regards, Johan On Mon, May 31, 2010 at 9:39 AM, Peter Neubauer peter.neuba...@neotechnology.com wrote: ... 2010/5/31 Tobias Mühlbauer tobias.muehlba...@gmail.com: Hi, We're currently using a Neo4j database with 3,7 million nodes and 244 million relationships. However we are facing a problem using the shortest path algorithms ShortestPath from graph-algo 0.6. Our server has a 2.8GHz Core 2 Duo processor and 4 GB of RAM installed. Starting a shortest-path search between two arbitrary nodes can take up to half an hour. Calling the search a second time using the same nodes, it finishes in milliseconds. 2 things make me wonder: 1. Is there a way to load parts of the database into memory prior to the first search (preloading)? 2. Running the search algorithm only uses 2% of CPU/0,5MB/s read from disk. So there are resources left that are unused. What can I do to find the bottleneck? Greetings, Toby ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] [Neo] Question regarding insertRelationship performance - memory issue?
Hi, Could you try to turn off logical log rotation or change the default target log size to see if that changes anything. You can do this by accessing the data source for the native store: GraphDatabaseService graphDb = new EmbeddedGraphDatabase( dbdir, ... ); TxModule txModule = ((EmbeddedGraphDatabase) graphDb).getConfig().getTxModule(); XaDataSourceManager xaDsMgr = txModule.getXaDataSourceManager(); XaDataSource xaDs = xaDsMgr.getXaDataSource( "nioneodb" ); xaDs.setAutoRotate( false ); // or increase log target size to 100MB (default 10MB) like this // xaDs.setLogicalLogTargetSize( 100 * 1024 * 1024l ); The page out problem could also be a Linux kernel tweaking issue where it is configured to start writing out dirty pages too early. If you type the following in a console: #watch grep -A 1 dirty /proc/vmstat then observe nr_writeback. If that value starts to increase more and more when performing writes you need to tweak your virtual memory settings. You can control when the linux kernel will start writing out dirty pages by modifying the vm.dirty_background_ratio and vm.dirty_ratio settings. The dirty_background_ratio setting tells what percentage of memory/pages can be dirty before a background task is started to write them out to disk. When that happens performance will drop and write performance will be heavily affected since it will result in more random writes instead of only sequential writes to the log. To change these settings edit /etc/sysctl.conf: vm.dirty_background_ratio = 80 vm.dirty_ratio = 90 Then to load the new settings type the following in a console: #sysctl -p The 80 means that when 80% of pages are dirty the background write-out task is started. The other setting (at 90%) is the point at which all writes will have to wait until the data has been written to the underlying device (which will kill write performance even more). Hopefully you can tweak away the page out problem by modifying these settings. Regards, Johan On Mon, May 24, 2010 at 1:28 PM, Alisa Devlic a...@appearnetworks.com wrote: Hello, My name is Alisa Devlic and I am a new neo4j mailing list subscriber. I have the following problem. I performed the following tests in Neo4j (with and without indexing): - Inserting nodes with 2 properties - Inserting 2 relationships between nodes - Get nodes by Id/name - Traversing In these tests I have increased the number of nodes I am inserting, inserting 2 relationships between each 2 nodes, doing the traversing of all these nodes and obtaining the nodes by their Id/property value. I measured the time needed to perform these operations; the resulting graph I have put in the attachment. Before performing the tests, I have looked into the performance guide and configuration settings, see the 2 following links: http://wiki.neo4j.org/content/Configuration_Settings http://wiki.neo4j.org/content/Neo4j_Performance_Guide I ran the JVM in server mode with 512 MB heap size on Ubuntu 8.04, with the default neo4j configuration. The RAM memory of my machine is 3550 MB and the amount of memory reserved for linux and other programs is 805 MB. I performed tests with different number of nodes, ranging from 1 to 8 (I will continue with more nodes later). The problem that I found is with insert relationship performance (see attachment), that it jumps at 39000 nodes to 12 ms from 3.5 ms (at 38000 nodes) and after increasing the number of nodes further (4, 5) the time decreases, then for 7 and 8 increases again.
The interesting thing is if you look at the graph, you can see that the difference between the time of 38000 and 39000 nodes is around 8seconds, then starts decreasing when increasing number of nodes further. My assumption is that here the page faults problem appears, but I don't understand why the time after 4 nodes decreases. I checked the page size at my machine and it is 4096 bytes, meaning that the number of pages for 512 MB is 128. I looked with vmstat command the memory statistics (at 39000 and 4 nodes) and I noticed that the number of page-outs (the event when pages are written to disk) increases each time I run the insert relationship command. However, there were no swap ins or swap outs. I also did not try bulk inserter yet, which I read about that to have the better performance. I did not look into previous posts of this emailing list, if you encounter such an issue before, please let me know where to look for the explanation/solution. In any case I am looking forward to your response and suggestions, Best regards, Alisa ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Johan Svensson [jo...@neotechnology.com] Chief Technology Officer, Neo Technology www.neotechnology.com
Re: [Neo] memory mapping
If a run is that long when performing traversals -server flag should be faster. Could you explain a bit more what type of traversal you are performing and what the graph looks like? Judging by the size of the store files you should be able to traverse the full graph many times in a single day on that machine. -Johan On Fri, May 21, 2010 at 1:15 PM, Lorenzo Livi lorenz.l...@gmail.com wrote: No, I use only one jvm instance for each run. My run usually last something like 1 day or 15 days. On Fri, May 21, 2010 at 1:10 PM, Johan Svensson jo...@neotechnology.com wrote: Yes, -server is usually slower the first few runs but once some information has been gathered and optimizations put in place it will be faster. Are you starting a new JVM for each traversal? ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Fwd: Node not in use exception when using tx event handler
This was indeed a bug and I just committed a fix for it in trunk. -Johan On Tue, May 18, 2010 at 6:28 PM, Garrett Smith g...@rre.tt wrote: See attached. To reproduce, run against r4415: $ java -cp PATH NodeNotInUse DBDIR create $ java -cp PATH NodeNotInUse DBDIR delete NODEID I modified src/main/java/org/neo4j/kernel/impl/core/TransactionEventsSyncHook.java to get the error info: --- src/main/java/org/neo4j/kernel/impl/core/TransactionEventsSyncHook.java (revision 4415) +++ src/main/java/org/neo4j/kernel/impl/core/TransactionEventsSyncHook.java (working copy) @@ -35,6 +35,7 @@ public void beforeCompletion() { + try { this.transactionData = nodeManager.getTransactionData(); states = new ArrayList<HandlerAndState>(); for ( TransactionEventHandler<T> handler : this.handlers ) @@ -55,6 +56,10 @@ throw new RuntimeException( t ); } } + } catch (Throwable th) { + th.printStackTrace(); + throw new RuntimeException(th); + } } On Tue, May 18, 2010 at 6:38 AM, Johan Svensson jo...@neotechnology.com wrote: Garrett, This could be a bug. Could you please provide a test case that triggers this behavior. -Johan On Sat, May 15, 2010 at 8:46 PM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: Create a ticket for it, I've tagged it for reviewing when I get back to the office; you had the great misfortune to send this right at the beginning of a 4 day Swedish holiday. If you could supply code that can reproduce it that would be even better. Cheers, Tobias On Sat, May 15, 2010 at 8:42 PM, Garrett Smith g...@rre.tt wrote: Is this something I should open a ticket for, or is it something the dev team is aware of? Or is it user error? Garrett -- Forwarded message -- From: Garrett Smith g...@rre.tt Date: Thu, May 13, 2010 at 2:52 PM Subject: Node not in use exception when using tx event handler To: Neo4j Users user@lists.neo4j.org I'm running into the exception below when I try to delete a node when first starting up a graph database. I'm experimenting with a transaction event handler. The error, however, occurs before my handler gets called.
org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Node[10] not in use at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.nodeGetProperties(WriteTransaction.java:1009) at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaConnection$NodeEventConsumerImpl.getProperties(NeoStoreXaConnection.java:228) at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$NioNeoDbResourceConnection.nodeLoadProperties(NioNeoDbPersistenceSource.java:432) at org.neo4j.kernel.impl.persistence.PersistenceManager.loadNodeProperties(PersistenceManager.java:100) at org.neo4j.kernel.impl.core.NodeManager.loadProperties(NodeManager.java:628) at org.neo4j.kernel.impl.core.NodeImpl.loadProperties(NodeImpl.java:84) at org.neo4j.kernel.impl.core.Primitive.ensureFullLightProperties(Primitive.java:591) at org.neo4j.kernel.impl.core.Primitive.getAllCommittedProperties(Primitive.java:604) at org.neo4j.kernel.impl.core.LockReleaser.populateNodeRelEvent(LockReleaser.java:855) at org.neo4j.kernel.impl.core.LockReleaser.getTransactionData(LockReleaser.java:740) at org.neo4j.kernel.impl.core.NodeManager.getTransactionData(NodeManager.java:914) at org.neo4j.kernel.impl.core.TransactionEventsSyncHook.beforeCompletion(TransactionEventsSyncHook.java:39) at org.neo4j.kernel.impl.transaction.TransactionImpl.doBeforeCompletion(TransactionImpl.java:341) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:556) at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:103) at org.neo4j.kernel.EmbeddedGraphDbImpl$TransactionImpl.finish(EmbeddedGraphDbImpl.java:410) at gv.graph.Nodes.deleteNode(Nodes.java:349) at gv.graph.NodeDelete.handle(NodeDelete.java:20) at gv.graph.MessageHandler.run(MessageHandler.java:59) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) May 13, 2010 2:42:56 PM org.neo4j.kernel.impl.transaction.TransactionImpl doBeforeCompletion WARNING: Caught exception from tx syncronization[org.neo4j.kernel.impl.core.transactioneventssynch...@edf3f6 ] beforeCompletion() May 13, 2010 2:42:56 PM org.neo4j.kernel.impl.transaction.TransactionImpl doAfterCompletion WARNING: Caught exception from tx syncronization[org.neo4j.kernel.impl.core.transactioneventssynch...@edf3f6 ] afterCompletion() Code details: URL: https://svn.neo4j.org/components/kernel/trunk Repository Root: https://svn.neo4j.org Repository
Re: [Neo] memory mapping
Yes, -server is usually slower the first few runs but once some information has been gathered and optimizations put in place it will be faster. Are you starting a new JVM for each traversal? On Fri, May 21, 2010 at 1:06 PM, Lorenzo Livi lorenz.l...@gmail.com wrote: ... and I don't have concurrency ... I'm working on a lab environment ... Best regard, Lorenzo On Fri, May 21, 2010 at 12:54 PM, Johan Svensson jo...@neotechnology.com wrote: Hi, If your traversals access properties I would suggest full memory mapping: neostore.nodestore.db.mapped_memory=1G neostore.relationshipstore.db.mapped_memory=2G neostore.propertystore.db.mapped_memory=1700M neostore.propertystore.db.strings.mapped_memory=1200M neostore.propertystore.db.arrays.mapped_memory=0M If you are running Windows OS try experiment with: use_memory_mapped_buffers=true Make sure you are running 1.6 JVM and I suggest a heap size between 3G-8G depending on how much concurrent traversal load you have. Run in server mode and CMS collector: java -d64 -server -XX:+UseConcMarkSweepGC -Xmx4000M Regards, Johan ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Implementing new persistence source
On Wed, May 19, 2010 at 4:48 PM, Jawad Stouli ja...@citizenplace.com wrote: Hi Johan and thanks for your answer. I think that I have figured out the major concepts behind PersistenceSource and I have a partially working prototype of Neo4j using Cassandra. As you stated, I had to make some minor modifications to Neo4j core to handle my own PersistenceSource. I really want to keep my work compatible with future versions of Neo4j; would it be possible to add back the possibility to choose that source? Yes we can certainly do that. Some concepts remain unclear to me and I still have some unanswered questions. - Why do you use a property index? It seems to me that it is used to store an integer id / property key correspondence and then use it to store / retrieve properties. Is it tightly coupled to the way nioneo handles properties or am I missing something more important? The reason is that it is faster to read/write an integer from/to disk than a string key. Typically you will have few unique property key names in any given system so it is an optimization to make add/remove/get property faster. - PersistenceSource, Transaction and Command have a clear role in the xaframework. But I don't really see the difference between XaDataSource and XaConnection. Yes that could have been done differently and I guess the reasons are the old JTA and XA specifications. There are discussions in progress on removing the dependency on JTA and writing something new that fits better into modern containers/frameworks (with optional support for JTA), which would likely result in a cleaner API and implementation. - I don't understand the logical log and what this process is used for. To make sure every transaction that has been committed will be there if the system crashes. The logical log contains all operations performed and the data will be forced to disk before each transaction commits. The log can then be used to put the normal store files in a consistent state after a crash. Regards, Johan Thanks in advance, Jawad On Tue, May 18, 2010 at 1:22 PM, Johan Svensson jo...@neotechnology.com wrote: Hi, Have a look at the org.neo4j.kernel.impl.nioneo.xa package. To implement a new persistence source start by creating new implementations of the NeoStoreXaDataSource and NeoStoreXaConnection classes. It is no longer possible to swap in a different persistence source using configuration (it used to be) but if you modify the code in the org.neo4j.kernel.GraphDbInstance.start method to register YourImplNeoStoreXaDataSource instead of the nioneo one (with the same name) it should work. Back when we had Neo4j running on different relational databases (Postgres, Informix, MySQL) one big problem was that as the total number of relationships in the graph increased, the time to figure out which relationships a specific node had also grew (regardless of whether that node had few relationships). It is important to have a getRelationships method where execution time is proportional to the number of relationships on that node, in order to maintain high traversal speed as the graph increases in size. Regards, Johan On Sat, May 15, 2010 at 8:03 PM, Jawad Stouli ja...@citizenplace.com wrote: Hi everyone, I would be very interested in getting more information that would help me implement new persistence sources.
I have read (there: http://www.mail-archive.com/user@lists.neo4j.org/msg6.html) that it should not be that difficult (or, at least, it is possible) but I still have some difficulties while navigating through the sources to understand exactly how it should be done. Besides, I have read that using MySQL was less efficient than Nioneo. Was the difference really important ? Best, Jawad ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
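To illustrate the property index idea from the exchange above in isolation (a conceptual toy, not the kernel's actual implementation): each distinct key string is stored once and referenced by a small integer id, so property records only carry an int instead of repeating the key string.

import java.util.HashMap;
import java.util.Map;

public class PropertyKeyIndex
{
    private final Map<String,Integer> keyToId = new HashMap<String,Integer>();
    private final Map<Integer,String> idToKey = new HashMap<Integer,String>();
    private int nextId = 0;

    // return the existing id for a key, or assign a new one the first time it is seen
    public synchronized int getOrCreateId( String key )
    {
        Integer id = keyToId.get( key );
        if ( id == null )
        {
            id = nextId++;
            keyToId.put( key, id );
            idToKey.put( id, key );
        }
        return id;
    }

    // map an id back to its key string when reading properties
    public synchronized String getKey( int id )
    {
        return idToKey.get( id );
    }
}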
Re: [Neo] TransactionEventHandler and Spring transaction handling
Hi, I have not tried to reproduce this but just looking at the code I think it is a bug so thanks for reporting it! The synchronization hook that gathers the transaction data gets registered in the call to GraphDatabaseService.beginTx() but when using Spring (with that configuration) UserTransaction (old JTA) will be called directly so no events will be collected. Will fix this very soon. -Johan On Wed, May 19, 2010 at 5:49 PM, Antonis Lempesis ant...@di.uoa.gr wrote: Hello all, I have set up spring to handle transactions for neo4j (according to the imdb example) and it works fine. When I read about the new events framework, I checked out the latest revision (4421) and tried to register my TransactionEventHandler that simply prints the number of created nodes. The weird thing is that when I test this in a simple junit test case, the TransactionData I get contains the correct data. When I do the same thing using the spring configuration, the TransactionData is empty. Any ideas? Thanks, Antonis ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Batch inserter performance
I did some benchmarking SSD vs mechanical disk using the batch inserter injecting a dataset (social graph) that required ~10G to fit in RAM on a 4G machine. Preliminary results indicate that there is no difference. The SSD used has about the same sequential write speed as the mechanical disk and I think that is why (the batch inserter tries to read and write large chunks of data sequentially). (In normal non batch inserter mode the SSD is 50-100x faster on non cached reads). -Johan On Tue, May 18, 2010 at 7:48 PM, Lorin Halpert alyst...@gmail.com wrote: I'm curious how performance would differ/degrade when using SSDs instead of old standard HDDs after RAM is saturated. Anyone have numbers? On Tue, May 18, 2010 at 8:30 AM, Johan Svensson jo...@neotechnology.comwrote: Working with a 250M relationships graph you need better hardware (more RAM) to get good performance. The batch inserter tries to write as much as possible to memory then write sequentially to disk but since you have so little RAM it can not do that and will instead have to load data from disk and write out whenever needed. You may even get better performance not using batch inserter at all to insert the last 150M relationships. When there is enough RAM you should get more than 100k relationship inserts/s on standard server hardware using the batch inserter. -Johan On Tue, May 18, 2010 at 2:03 PM, Alex Averbuch alex.averb...@gmail.com wrote: Hi Johan, Thanks. At the moment I'm not using the property file but I'll start doing so next time I load a graph like this. The machine that's creating the database is actually my old one (my new one's power supply died a few days ago so I'm waiting on a new one to arrive), so I only have 2GB of RAM. Heap is set to 1.5GB at the moment. Given my configuration is the performance I described typical? Alex On Tue, May 18, 2010 at 1:50 PM, Johan Svensson jo...@neotechnology.com wrote: Alex, How large heap and what configuration setting do you use? To inject 250M random relationships at highest possible speed would require at least a 8GB heap with most of it assigned to the relationship store. See http://wiki.neo4j.org/content/Batch_Insert#How_to_configure_the_batch_inserter_properly for more information. -Johan On Tue, May 18, 2010 at 10:50 AM, Alex Averbuch alex.averb...@gmail.com wrote: Correction, the performance had degraded from ~3500 Relationships/Second to ~1500 Relationships/Second. Sloppy math... :) On Tue, May 18, 2010 at 10:46 AM, Alex Averbuch alex.averb...@gmail.com wrote: Hey, I'm loading a graph from a proprietary binary file format into Neo4j using the batch inserter. The graph (Twitter crawl results) has 2,500,000 Nodes 250,000,000 Relationships. Here's what I'm doing: (1) Insert all Nodes first. While doing so I also add 1 property (lets call is CUSTOM_ID) and index it with Lucene. (2) Call optimize() on the index (3) Insert all the Relationships. I use CUSTOM_ID to lookup the start end Nodes. Relationships have no properties. The problem is that the insertion performance seems to decay quite quickly as the size increases. I'm keeping track of how long it takes to insert the records. In the beginning it took about 5 minutes to insert 1,000,000 Relationships. After about 50,000,000 inserted Relationships it was close to 10 minutes to insert 1,000,000 Relationships. By the time I was up to 70,000,000 it was taking 12 minutes to insert 1,000,000 Relationships. 
That's a drop from ~7,000 Relationships/Second to ~3000 Relationships/Second and I'm worried that if this continues it could take over a week to load this dataset. Can you think of anything that I'm doing wrong? I have a neo.prop file but I'm not using it... I create the batch inserter with only 1 parameter (database directory). Cheers, Alex ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Batch inserter performance
Alex, How large heap and what configuration setting do you use? To inject 250M random relationships at highest possible speed would require at least a 8GB heap with most of it assigned to the relationship store. See http://wiki.neo4j.org/content/Batch_Insert#How_to_configure_the_batch_inserter_properly for more information. -Johan On Tue, May 18, 2010 at 10:50 AM, Alex Averbuch alex.averb...@gmail.com wrote: Correction, the performance had degraded from ~3500 Relationships/Second to ~1500 Relationships/Second. Sloppy math... :) On Tue, May 18, 2010 at 10:46 AM, Alex Averbuch alex.averb...@gmail.comwrote: Hey, I'm loading a graph from a proprietary binary file format into Neo4j using the batch inserter. The graph (Twitter crawl results) has 2,500,000 Nodes 250,000,000 Relationships. Here's what I'm doing: (1) Insert all Nodes first. While doing so I also add 1 property (lets call is CUSTOM_ID) and index it with Lucene. (2) Call optimize() on the index (3) Insert all the Relationships. I use CUSTOM_ID to lookup the start end Nodes. Relationships have no properties. The problem is that the insertion performance seems to decay quite quickly as the size increases. I'm keeping track of how long it takes to insert the records. In the beginning it took about 5 minutes to insert 1,000,000 Relationships. After about 50,000,000 inserted Relationships it was close to 10 minutes to insert 1,000,000 Relationships. By the time I was up to 70,000,000 it was taking 12 minutes to insert 1,000,000 Relationships. That's a drop from ~7,000 Relationships/Second to ~3000 Relationships/Second and I'm worried that if this continues it could take over a week to load this dataset. Can you think of anything that I'm doing wrong? I have a neo.prop file but I'm not using it... I create the batch inserter with only 1 parameter (database directory). Cheers, Alex ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Batch inserter performance
Working with a 250M relationships graph you need better hardware (more RAM) to get good performance. The batch inserter tries to write as much as possible to memory then write sequentially to disk but since you have so little RAM it can not do that and will instead have to load data from disk and write out whenever needed. You may even get better performance not using batch inserter at all to insert the last 150M relationships. When there is enough RAM you should get more than 100k relationship inserts/s on standard server hardware using the batch inserter. -Johan On Tue, May 18, 2010 at 2:03 PM, Alex Averbuch alex.averb...@gmail.com wrote: Hi Johan, Thanks. At the moment I'm not using the property file but I'll start doing so next time I load a graph like this. The machine that's creating the database is actually my old one (my new one's power supply died a few days ago so I'm waiting on a new one to arrive), so I only have 2GB of RAM. Heap is set to 1.5GB at the moment. Given my configuration is the performance I described typical? Alex On Tue, May 18, 2010 at 1:50 PM, Johan Svensson jo...@neotechnology.comwrote: Alex, How large heap and what configuration setting do you use? To inject 250M random relationships at highest possible speed would require at least a 8GB heap with most of it assigned to the relationship store. See http://wiki.neo4j.org/content/Batch_Insert#How_to_configure_the_batch_inserter_properly for more information. -Johan On Tue, May 18, 2010 at 10:50 AM, Alex Averbuch alex.averb...@gmail.com wrote: Correction, the performance had degraded from ~3500 Relationships/Second to ~1500 Relationships/Second. Sloppy math... :) On Tue, May 18, 2010 at 10:46 AM, Alex Averbuch alex.averb...@gmail.com wrote: Hey, I'm loading a graph from a proprietary binary file format into Neo4j using the batch inserter. The graph (Twitter crawl results) has 2,500,000 Nodes 250,000,000 Relationships. Here's what I'm doing: (1) Insert all Nodes first. While doing so I also add 1 property (lets call is CUSTOM_ID) and index it with Lucene. (2) Call optimize() on the index (3) Insert all the Relationships. I use CUSTOM_ID to lookup the start end Nodes. Relationships have no properties. The problem is that the insertion performance seems to decay quite quickly as the size increases. I'm keeping track of how long it takes to insert the records. In the beginning it took about 5 minutes to insert 1,000,000 Relationships. After about 50,000,000 inserted Relationships it was close to 10 minutes to insert 1,000,000 Relationships. By the time I was up to 70,000,000 it was taking 12 minutes to insert 1,000,000 Relationships. That's a drop from ~7,000 Relationships/Second to ~3000 Relationships/Second and I'm worried that if this continues it could take over a week to load this dataset. Can you think of anything that I'm doing wrong? I have a neo.prop file but I'm not using it... I create the batch inserter with only 1 parameter (database directory). Cheers, Alex ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
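Following Johan's advice, a configured run of the same load could look like the sketch below. The mapped_memory values here are illustrative only and should be sized against the actual store files as described on the wiki page he links; the JVM itself would be started with a large heap, for example -Xmx8G -server.

import java.util.HashMap;
import java.util.Map;

import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class ConfiguredBatchLoad
{
    public static void main( String[] args )
    {
        Map<String,String> config = new HashMap<String,String>();
        // give the relationship store the bulk of the memory,
        // since 250M relationships dominate this dataset
        config.put( "neostore.nodestore.db.mapped_memory", "100M" );
        config.put( "neostore.relationshipstore.db.mapped_memory", "4G" );
        config.put( "neostore.propertystore.db.mapped_memory", "100M" );
        config.put( "neostore.propertystore.db.strings.mapped_memory", "100M" );
        BatchInserter inserter = new BatchInserterImpl( "target/twitter-db", config );
        // ... same two-pass node/relationship load as before ...
        inserter.shutdown();
    }
}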
Re: [Neo] Traversal Speed is just 1 millisecond per node
Hi, I understand the graph layout as this: (publisher node)--PUBLISHED_BY---(book)--BORROWED_BY--(student) There are other relationships between books and students (RETURNED_BY,RESERVED_BY) but the relationship count on a book node will still be low compared to the publisher node. Correct? On a machine with 8GB RAM and that graph calculating amount of borrowed books for a publisher or get amount of books never borrowed for a publisher should execute in under 100ms given your traversal speed is 300+ hops/ms (in 30ms with 1000 hops/ms). Normal cached traversal speed on any modern hardware should be 1000+ hops/ms. The poor performance when performing the depth two traversal to get the amount of borrowed relationships on a publisher is indeed very strange. A possible reason could be that the traverser is traversing more than it should. For example if you create a traverser that traverses the PUBLISHED_BY and BORROWED_BY relationships in both directions you would traverse (publisher)-(book)-(student)-(other book)-(other or same publisher)-(other book)... and so on (and that would take time). Also make sure you are not measuring number of borrowed books traversal speed over all publishers instead of a single publisher (retrieving borrowed books from all publishers with a low borrowed ratio would result in a low borrowed books/ms value). Could you write a simple traverser using the core API that calculates the relationship count at depth two given a publisher node. Something like: Node publisher = // get a publisher node int count = 0; for ( Relationship bookRel : publisher.getRelationships( RelTypes.PUBLISHED_BY ) ) { count++; Node book = bookRel.getOtherNode( publisher ); for ( Relationship studentRel : book.getRelationships( RelTypes.BORROWED_BY ) ) { count++; } } Avg count and time to perform that operation wold be interesting. If timings are not in expected range (basically 15k [avg number of rels/publisher] x 2 [the borrowed_by rel] x traversal speed) try get the total relationship count by replacing getRelationships(type) with getRelationships() to get worst case scenario on how much traversing needs to be done. Regards, -Johan On Sun, May 16, 2010 at 4:08 AM, suryadev vasudev suryadev.vasu...@gmail.com wrote: Here is a rough design and volume ... When I first read about traversal speed of 1000-3000/millisecond, I added some buffer and assumed 500/millisecond as a realistic speed. I am not giving up so easily after seeing 1/millisecond. I look forward to responses from other users. The real challenges will be around queries for a publisher. A publisher will have around 15,000 books and a query like Given a published ID, what percentage of his books were never borrowed will need full browsing. My hope was that I could browse through and get the answer in 30 milliseconds. But it looks like it will take a minimum of 15 seconds. Some publishers will have 50,000 books and I can't imagine a response time of 50 seconds. So, I have to achieve at least 500/millisecond if not the original 1000. Regards SDev ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
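A self-contained, timed version of the depth-two count Johan asks for might look like this. PUBLISHED_BY and BORROWED_BY are declared with DynamicRelationshipType purely so the sketch compiles on its own; in the real application they would be the existing RelTypes enum.

import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.RelationshipType;

public class DepthTwoCount
{
    private static final RelationshipType PUBLISHED_BY =
        DynamicRelationshipType.withName( "PUBLISHED_BY" );
    private static final RelationshipType BORROWED_BY =
        DynamicRelationshipType.withName( "BORROWED_BY" );

    static void timeDepthTwoCount( Node publisher )
    {
        long start = System.currentTimeMillis();
        int count = 0;
        for ( Relationship bookRel : publisher.getRelationships( PUBLISHED_BY ) )
        {
            count++;
            Node book = bookRel.getOtherNode( publisher );
            for ( Relationship studentRel : book.getRelationships( BORROWED_BY ) )
            {
                count++;
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println( count + " relationships in " + elapsed + " ms ("
            + ( elapsed > 0 ? count / elapsed : count ) + " hops/ms)" );
    }
}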
Re: [Neo] triples store loading
Hi, Adding a literal (with average size around ~400 bytes if the numbers are correct) should not result in such a big difference of injection times. Could you give some more information regarding setup so we can track down the cause of this. Good things to know would be: o java version and jvm switches/configuration (heap size, gc settings etc) o amount of RAM on the machine o neo4j kernel version (1.1-SNAPSHOT or 1.0?) o configuration passed into the neo4j kernel if any o size of the different store files in the neo4j db directory (after injection of data) Regards, Johan On Fri, May 7, 2010 at 10:26 AM, Mattias Persson matt...@neotechnology.com wrote: I'm not sure why it's taking such a long time, it shouldn't be that slow. I'll see if I can get time to look into that... but I don't know when exactly. Have you ran it through a profiler yourself? I think that's a good idea since I'm a little locked up right now :) Maybe the answer is obvious once I can look at such an output from a profiler! 2010/5/6 Lyudmila L. Balakireva lu...@lanl.gov Hi, I am testing the neo4j with dbpedia data. When I am loading triples in the form (URI,URI,URI) the speed is good ( 10 mln in 2 min). The loading URI,URI,LITERAL is very slow. The Literal is mainly chunks of text. For example shortabstract_en.nt with 2943434 records took 6.4 hrs to load. The file size is 1194900107. Is it possible to optimize the loading of literals or it is better to store abstracts somewhere else and keep pointer to some verbose text in neo4j? What is casing the dramatic difference in loading of the verbose literals? Thanks, Luda ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] IndexBatchInserter can't read an index
On Thu, Apr 29, 2010 at 7:48 PM, Jon Noronha cheesel...@gmail.com wrote: ... Is this a feature or a bug? All of the examples suggest that it's possible to read from the LuceneIndexBatchInserter, and indeed if I combine the code that creates the nodes and the index with the code that reads it into one file everything works perfectly. It's only in separating the files that this problem occurs. It is a bug and we just committed a fix for it in trunk. Regards, Johan ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Fwd: Neo4J and Concurrency
Hi, This code will shut down the kernel right away. Depending on timing you may shut down the kernel while the thread pool is still executing, and that could be the cause of your error. If you remove the @After / kernel shutdown code, or add code in the @Test method to wait for the thread pool to execute all tasks, will it work then? Regards, Johan On Mon, Apr 26, 2010 at 2:05 PM, Stefan Berndt kont...@stberndt.de wrote: Of course I can. This is my test case class file:

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.impl.transaction.DeadlockDetectedException;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WriteOneNode
{
    private DatabaseAccessor db;
    private static long NodeId;

    @Before
    public void setUp() throws IOException
    {
        db = SingletonFactory.getInstance();
        createNodes();
    }

    @After
    public void tearDown()
    {
        db.shutdown();
    }

    private void createNodes()
    {
        Transaction tx = db.beginTx();
        try
        {
            Kunde k = new KundeImpl( db.createNode() );
            k.setName( "ol" );
            NodeId = k.getunderlyingNode().getId();
            tx.success();
        }
        finally
        {
            tx.finish();
        }
    }

    @Test
    public void writeOneNodeConcurrent()
    {
        ExecutorService pool = Executors.newFixedThreadPool( 1 );
        for ( int i = 0; i < 10; i++ )
        {
            final int j = i;
            pool.execute( new Runnable()
            {
                public void run()
                {
                    Transaction tx = db.beginTx();
                    try
                    {
                        writeNode( j );
                    }
                    catch ( DeadlockDetectedException e )
                    {
                        System.err.println( e.getMessage() );
                    }
                    finally
                    {
                        tx.finish();
                    }
                }
            } );
        }
    }

    private void writeNode( int j )
    {
        if ( j % 2 == 1 )
        {
            db.getNodeById( NodeId ).setProperty( KundeImpl.KEY_NAME, "n1" );
        }
        else
        {
            db.getNodeById( NodeId ).setProperty( KundeImpl.KEY_NAME, "n2" );
        }
    }
}

Thanks for your help -- Stefan We should fix this as soon as we can, could you provide a (small) test case that can reproduce this with some reliability? /Tobias Original Message Subject: Neo4J and Concurrency Date: Mon, 26 Apr 2010 09:10:18 + From: Stefan Berndt kont...@stberndt.de To: user@lists.neo4j.org Hello, I have been testing Neo4j for a week now and I'm trying to make some operations on the graph concurrent. For this I use a pool executor and do some write operations, but the TransactionManager just throws exceptions I don't understand. For example: javax.transaction.xa.XAException: Unknown xid[GlobalId[NEOKERNL|1272272945279|2], BranchId[ 52 49 52 49 52 49 ]] at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.rollback(XaResourceManager.java:416) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.rollback(XaResourceHelpImpl.java:111) at org.neo4j.kernel.impl.transaction.TransactionImpl.doRollback(TransactionImpl.java:531) at org.neo4j.kernel.impl.transaction.TxManager.rollback(TxManager.java:728) at org.neo4j.kernel.impl.transaction.TransactionImpl.rollback(TransactionImpl.java:114) at org.neo4j.kernel.EmbeddedGraphDbImpl$TransactionImpl.finish(EmbeddedGraphDbImpl.java:336) at ConcurrentTest$1.run(ConcurrentTest.java:62) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) 26.04.2010 11:09:05 org.neo4j.kernel.impl.transaction.TxManager rollback SCHWERWIEGEND: Unable to rollback marked or active transaction. Some resources may be commited others not. 
Neo4j kernel should be SHUTDOWN for resource maintance and transaction recovery Exception in thread pool-1-thread-1 org.neo4j.kernel.impl.transaction.TransactionFailureException: Unable to rollback transaction at org.neo4j.kernel.EmbeddedGraphDbImpl$TransactionImpl.finish(EmbeddedGraphDbImpl.java:349) at ConcurrentTest$1.run(ConcurrentTest.java:62) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: javax.transaction.SystemException: Unable to rollback --- error code for rollback: 0 at org.neo4j.kernel.impl.transaction.TxManager.rollback(TxManager.java:738) at
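One way to apply Johan's suggestion is to drain the pool before the test method returns, so the @After shutdown cannot race the queued writes. A sketch of the modified test method, assuming the class from Stefan's mail:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

@Test
public void writeOneNodeConcurrent() throws InterruptedException
{
    ExecutorService pool = Executors.newFixedThreadPool( 1 );
    for ( int i = 0; i < 10; i++ )
    {
        final int j = i;
        pool.execute( new Runnable()
        {
            public void run()
            {
                // same transactional writeNode( j ) body as before
            }
        } );
    }
    pool.shutdown();                                  // accept no new tasks
    pool.awaitTermination( 60, TimeUnit.SECONDS );    // wait for queued writes to finish
    // only after this point is it safe for tearDown() to call db.shutdown()
}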
Re: [Neo] force preloading into memory
On Tue, Apr 20, 2010 at 10:42 AM, Erik Ask erik...@maths.lth.se wrote: Tobias Ivarsson wrote: The speedup you are seeing is because of caching. Items that are used are loaded into an in-memory structure, that does not need to go through any filesystem API, memory-mapped or not. The best way to load things into cache is to run the query once to touch everything that needs to be loaded. Pre-adapting the memory-maps as you suggest would give some speedup to the actual process of the first query, but that time would be spent in startup instead, meaning that the time from cold start to completed first query would be exactly the same. Cheers, Tobias On Mon, Apr 19, 2010 at 6:31 PM, Erik Ask ask.e...@gmail.com wrote: Hello I'm getting really slow performance when working against the HD. A given set of queries can take up to 10 minutes when performed the first time. Repeating the same set of queries a second time is executed in seconds (2-5). As far as I can tell from watching in jconsole, the heap behaves in almost the exact same maner (slowly rising slope) for both transactions (each set of queries has it own transaction) so it seems the speedup is due to memory mapping. I've tinkered with the settings, but is there a way of explicitly forcing the IO mapper to preload all or part of the node store and relationship store? Am I right to assume that initially nothing is IO mapped and these buffers builds up during runtime as requests are made? Is there any way of tuning access to the HD? greetz Then i don't understand the purpose of loading files in to memory. I thought it was used to make a copy of as much of a file as possible into memory, then do all subsequent lookups there, and if needed replace parts if nonloaded parts of the file are more frequently requested than loaded. This would result in one hd-read per node/rel (assuming it fit into memory and no replacing was needed), as opposed to searching for entries in file that would require lots of reads and comparisons. The amount of data that needs to be loaded into memory just doesn't seem to warrant that much time being spent. I could easily copy files several times the size of my complete DB in less time than it takes to run my query sets. Hi, Tobias is right about the caching part but there are issues with memory mapped I/O in play here too. If you turn off memory mapped I/O and use normal buffers (use_memory_mapped_buffers=false) you will probably see a speedup in initial query time. This is because using memory mapped I/O will result in lots of seeks since most OS/configurations implement them in such a (maybe not ideal) way. Non memory mapped buffers will do sequential reads to fill the entire buffer and (depending on how much of the graph the first search touches) it will likely be faster. To explain further, if you request to map a region of a file into memory it will do so and return almost instantly. The contents of the file is however not loaded into memory, instead it will do lazy loads when you start to read bytes from the buffer resulting in more random I/O and seeks. This in turn results in slow searches and long warmup time on mechanical disks. (Note, behavior I described here may vary depending on OS, JVM implementation and so on.) You are right about the purpose of loading regions of files into memory so we don't have to do lookups on disk (and dynamically change those regions depending on access patterns). The problem is that initial access patterns when nothing has been loaded yet will look random. 
Then to further kill performance memory mapped regions will not do a sequential read of the data (this is very bad in your scenario but is better when the server is warm). A work around for this is to pre-fill the OS file-system caches before you start searching. Write a script that sequentially reads the node, relationship (and property store file if your searches access properties). That will cause the memory mapped regions to map against the file-system cache and then the contents of the file will already be in memory. Regards, Johan ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
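The pre-warming script Johan describes can be as simple as a sequential read of the store files; one possible shape, with the standard store file names (add the property store files if your queries read properties):

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class WarmFileSystemCache
{
    public static void main( String[] args ) throws IOException
    {
        String dbDir = args.length > 0 ? args[0] : "path/to/neo4j-db";
        String[] stores = { "neostore.nodestore.db",
                            "neostore.relationshipstore.db",
                            "neostore.propertystore.db" };
        byte[] buffer = new byte[1024 * 1024];
        for ( String store : stores )
        {
            FileInputStream in = new FileInputStream( new File( dbDir, store ) );
            long total = 0;
            int read;
            while ( ( read = in.read( buffer ) ) != -1 )
            {
                total += read; // the data is discarded; the point is to pull the file into the OS cache
            }
            in.close();
            System.out.println( store + ": " + total + " bytes read" );
        }
    }
}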
Re: [Neo] Exception when adding and deleting in a single transaction
Hi, There should be no problem to do multiple modifying operations in the same transaction. Since you are talking about statements I take it you are using the rdf component? What happens if you move the delete statement before the call to Tx.success()? Regards, Johan On Wed, Apr 14, 2010 at 5:59 PM, Subbacharya, Madhu madhu.subbacha...@corp.aol.com wrote: Hi, When I add a statement and then delete a different statement within a single transaction begin/end block, Tomcat is complaining about a dangling thread. Here is what I am doing: 1. Tx begin 2. Add a statement 3. If add succeeded, Tx.success() 4. Delete a statement 5. If delete failed, Tx.failure() 6. In the finally block of the try statement, I call Tx.finish(), shutdown the store, etc. I have a workaround by creating 2 separate transactions (one for the add and the other for the delete) and managing them independently with associated status codes, which works fine. However, what I would really like is to be able to do multiple DB modifying operations within a single transaction block. Assuming this is doable, would be nice to have a Tx.state() method that returns whether Tx.success() or Tx.failure() was last called. Thanks madhu Tomcat log: Apr 13, 2010 9:51:27 PM org.apache.catalina.loader.WebappClassLoader clearThreadLocalMap SEVERE: A web application created a ThreadLocal with key of type [null] (value [null]) and a value of type [org.neo4j.index.Isolation] (value [SAME_TX]) but failed to remove it when the web application was stopped. To prevent a memory leak, the ThreadLocal has been forcibly removed. Apr 13, 2010 9:51:27 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreadLocals WARNING: Failed to clear ThreadLocal references java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja va:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccesso rImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.catalina.loader.WebappClassLoader.clearThreadLocalMap(Webapp ClassLoader.java:2102) at org.apache.catalina.loader.WebappClassLoader.clearReferencesThreadLocal s(WebappClassLoader.java:2027) at org.apache.catalina.loader.WebappClassLoader.clearReferences(WebappClas sLoader.java:1710) at org.apache.catalina.loader.WebappClassLoader.stop(WebappClassLoader.jav a:1622) at org.apache.catalina.loader.WebappLoader.stop(WebappLoader.java:710) at org.apache.catalina.core.StandardContext.stop(StandardContext.java:4649 ) at org.apache.catalina.core.ContainerBase.removeChild(ContainerBase.java:9 24) at org.apache.catalina.startup.HostConfig.checkResources(HostConfig.java:1 174) at org.apache.catalina.startup.HostConfig.check(HostConfig.java:1342) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:3 03) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleS upport.java:119) at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase. 
java:1337) at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.pro cessChildren(ContainerBase.java:1601) at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.pro cessChildren(ContainerBase.java:1610) at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run (ContainerBase.java:1590) at java.lang.Thread.run(Thread.java:637) Caused by: java.lang.NullPointerException at java.lang.ThreadLocal.access$400(ThreadLocal.java:53) at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:436) ... 20 more ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
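For comparison, the single-transaction shape Johan is suggesting would look roughly like this; addStatement and deleteStatement are placeholders for the application's own RDF-component calls, not a real API:

Transaction tx = graphDb.beginTx();
try
{
    addStatement( statementToAdd );       // placeholder for the rdf component call
    deleteStatement( statementToDelete ); // placeholder for the rdf component call
    tx.success();                         // mark for commit once, after all work is done
}
catch ( RuntimeException e )
{
    tx.failure();                         // roll back both operations together
    throw e;
}
finally
{
    tx.finish();                          // commit or rollback happens here; no store shutdown per request
}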
Re: [Neo] Unable to memory map
Hi, The read only version is not faster on reads compared to a writable store. Internally the only difference is we open files in read only mode. The reason you get the error is that your OS does not support to place a memory mapped region to a file (opened in read only mode) when the region maps outside the file data (in write mode the file will grow in size when that happens). -Johan On Mon, Mar 29, 2010 at 9:03 PM, Marc Preddie mpred...@gmail.com wrote: Hi, I've had some time to look into this issue and it seems that when using the ReadOnly versions of the classes, I get the memory mapping warnings and when using the Writable versions of the classes, the warning does not occur (I'm assuming memory mapping gets enabled). I'm not against using the writable versions of the classes; my only concern is performance. Are the readonly versions faster that the writable versions? And if they are; then if memory mapping is not enabled, are they faster that the writable versions with memory mapping? I'll run some tests, but I guess I would like an expert opinion. Regards, Marc On Mon, Mar 22, 2010 at 10:24 AM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: Hi, We have seen this message before emitted as a warning from Neo4j. Are you seing this as a warning as well, or are you getting an exception thrown to your application code? It's hard to deal with these errors since nio only throws IOException, and not any more semantic information than that, I believe we deal with all cases by issuing a warning and then falling back to another method of performing the same operation, but if you are getting exceptions we need to resolve it. If you are indeed getting exceptions, some code that triggers it would be very helpful. Cheers, Tobias On Wed, Mar 17, 2010 at 1:47 PM, Marc Preddie mpred...@gmail.com wrote: Hi, I've look at the mailing list and found 1 similar situation, but no real solution. So I was hoping someone could shed some light on this. I seem to have an issue with neo4j being able to use memory mapped files. I've run my service on Win XP 64bit, Mac OSX Snow Leopard 10.6.2 and Centos 5.x 64bit and always get the same error when launching. I'm using APOC 1.0 and have a DB of approx 600M. In my neo config I allocate about 5M more for each type of file than the actual file size (I've tried multiple different settings). On each machine I also leave at least 1.5G for the OS and have at least 2.5G heap for the Java process. I'm also using the classes EmbeddedReadOnlyGraphDatabase and LuceneReadOnlyIndexService to access and browse DB. 
Neo config neostore.nodestore.db.mapped_memory=10M neostore.relationshipstore.db.mapped_memory=110M neostore.propertystore.db.mapped_memory=85M neostore.propertystore.db.index.mapped_memory=10M neostore.propertystore.db.index.keys.mapped_memory=10M neostore.propertystore.db.strings.mapped_memory=320M neostore.propertystore.db.arrays.mapped_memory=10M Here is the error org.neo4j.kernel.impl.nioneo.store.MappedMemException: Unable to map pos=3005872 recordSize=33 totalSize=1153416 at org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow.init(MappedPersistenceWindow.java:59) at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.allocateNewWindow(PersistenceWindowPool.java:530) at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.refreshBricks(PersistenceWindowPool.java:430) at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:122) at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:459) at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getChainRecord(RelationshipStore.java:248) at org.neo4j.kernel.impl.nioneo.xa.NeoReadTransaction.getMoreRelationships(NeoReadTransaction.java:103) at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$ReadOnlyResourceConnection.getMoreRelationships(NioNeoDbPersistenceSource.java:275) at org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:93) at org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:585) at org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:332) at org.neo4j.kernel.impl.core.NodeImpl.ensureFullRelationships(NodeImpl.java:320) at org.neo4j.kernel.impl.core.NodeImpl.getAllRelationshipsOfType(NodeImpl.java:129) at org.neo4j.kernel.impl.core.NodeImpl.getSingleRelationship(NodeImpl.java:179) at org.neo4j.kernel.impl.core.NodeProxy.getSingleRelationship(NodeProxy.java:98) Caused by: java.io.IOException: Access is denied at sun.nio.ch.FileChannelImpl.truncate0(Native Method) at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:728) at
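Given that the read-only store offers no read speedup, a plain writable setup with explicit mapped-memory settings might look like the sketch below, assuming the 1.0-era constructors; the numbers are copied from Marc's configuration above and the path is a placeholder.

import java.util.HashMap;
import java.util.Map;

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.index.IndexService;
import org.neo4j.index.lucene.LuceneIndexService;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class ReadMostlySetup
{
    public static void main( String[] args )
    {
        Map<String,String> config = new HashMap<String,String>();
        config.put( "neostore.nodestore.db.mapped_memory", "10M" );
        config.put( "neostore.relationshipstore.db.mapped_memory", "110M" );
        config.put( "neostore.propertystore.db.mapped_memory", "85M" );
        config.put( "neostore.propertystore.db.strings.mapped_memory", "320M" );
        GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "path/to/db", config );
        IndexService index = new LuceneIndexService( graphDb );
        // ... browse the database ...
        index.shutdown();
        graphDb.shutdown();
    }
}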
Re: [Neo] getNumberOfIdsInUse(Node.class)) return -1
Hi, I had a look at this and can not figure out why -1 is returned. When running the kernel in normal (write) mode the return value of number of ids in use will only be correct if all previous shutdowns have executed cleanly. This is an optimization to reduce the time spent in recovery rebuilding id generators after a crash/non clean shutdown. After a crash/non clean shutdown the number of ids in use will always be the highest id in use + 1. To force a full rebuild of the id generators on each startup (on a non clean shutdown) pass in the following configuration: rebuild_idgenerators_fast=false In read only mode the return value will always be the highest id in use + 1. You could try to delete the neostore.nodestore.db.id and pass in rebuild_idgenerators_fast=false as configuration when starting up (this will take a long time if the node store file is large). If you still get incorrect results send me a compressed version of the neostore.nodestore.db.id file and I will have a look at it. Regards, -Johan On Tue, Apr 6, 2010 at 3:38 PM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: Sorry, we have not had time to look into that yet. I'll let you know when we have. On Mon, Apr 5, 2010 at 12:31 PM, Laurent Laborde kerdez...@gmail.comwrote: Any news ? -- Ker2x On Fri, Mar 26, 2010 at 12:05 PM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: Ok, thanks. We'll look into it. On Fri, Mar 26, 2010 at 11:49 AM, Laurent Laborde kerdez...@gmail.com wrote: something between 100 millions and 1 billions, i guess. the DB contain the result of my collatz code from 1 to 100 millions. -- Ker2x On Fri, Mar 26, 2010 at 11:40 AM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: If you have a large number of nodes it could be a truncation error from long to int somewhere, how many nodes to you estimate that you have? It is a bug so we will fix it, but if we know the approximate estimated size it would help in finding the cause. /Tobias On Fri, Mar 26, 2010 at 7:59 AM, Laurent Laborde kerdez...@gmail.com wrote: my code do a : System.out.println(Number of nodes : + neo.getConfig().getNeoModule().getNodeManager().getNumberOfIdsInUse(Node.class)); it print : Number of nodes : -1 why does it print -1 ? how can i count node ? thank you :) ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
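Putting Johan's suggestion together, a start-up that forces the full id-generator rebuild and then prints the count could look like this; the path is a placeholder, and start-up will take a while on a large node store.

import java.util.HashMap;
import java.util.Map;

import org.neo4j.graphdb.Node;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class CountNodes
{
    public static void main( String[] args )
    {
        Map<String,String> config = new HashMap<String,String>();
        // rebuild id generators from the store instead of trusting the .id files
        config.put( "rebuild_idgenerators_fast", "false" );
        EmbeddedGraphDatabase graphDb = new EmbeddedGraphDatabase( "path/to/db", config );
        long inUse = graphDb.getConfig().getNeoModule().getNodeManager()
            .getNumberOfIdsInUse( Node.class );
        System.out.println( "Number of nodes: " + inUse );
        graphDb.shutdown();
    }
}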
Re: [Neo] BatchInserter.java fails
Hi, Can you reproduce this in a test case and send me the code? If not I would need the active logical log that rotation fails at (name of file is either called nioneo_logical.log.1 or nioneo_logical.log.2). Regards, -Johan On Fri, Feb 26, 2010 at 5:27 PM, Lyudmila L. Balakireva lu...@lanl.gov wrote: I am using neo4j-kernel version1.1-SNAPSHOT/version the full stack trace: java.lang.RuntimeException: org.neo4j.kernel.impl.transaction.TransactionFailureException: Unable to write command to logical log. ... Caused by: org.neo4j.kernel.impl.transaction.TransactionFailureException: Unable to write command to logical log. at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.addCommand(XaTransaction.java:243) at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.doPrepare(WriteTransaction.java:200) at ... org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:462) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:571) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:543) at org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.checkBatchCommit(GraphDatabaseSailConnectionImpl.java:821) ... 19 more Caused by: java.io.IOException: Log rotation failed, unknown log entry[-78] at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.rotate(XaLogicalLog.java:1253) at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.checkLogRotation(XaLogicalLog.java:596) at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.writeCommand(XaLogicalLog.java:558) at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.addCommand(XaTransaction.java:239) ... 27 more ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Neo4j Traverse API
On Fri, Feb 26, 2010 at 2:27 AM, Satish Varma Dandu dsva...@gmail.com wrote: Hi John/nishith/Ulf, Thanks guys for all your replies. John, I was thinking about the same thing that you suggested. I havent yet constructed a huge n/w, but i was just curious how long will it take to traverse for 100K nodes or 1M nodes comparing each nodes value Roughly you can visit about 1M nodes/s if you only traverse over relationships and don't do anything else. Regarding comparing the value that will depend on what is being compared. Loading a string property containing the profile text and then string search in that string will be very slow compared to the time it takes to go from one node to the next in the traversal. -Johan ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Neo4j Traverse API
Hi, Would it be possible to just traverse the friends looking for the "neo4j" text on their profile? Like this:

Node user = // get the user;
Traverser trav = user.traverse( Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH, ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.FRIEND, Direction.BOTH );
for ( Node node : trav )
{
    int currentDepth = trav.currentPosition().depth();
    if ( currentDepth > 2 )
    {
        // we only check second level friends
        break;
    }
    if ( userHasTextOnProfile( node, "neo4j" ) )
    {
        addNodeToResult( node, currentDepth );
    }
}

If that is too slow you could combine a lucene search (for fulltext search see http://wiki.neo4j.org/content/Indexing_with_IndexService#Fulltext_indexing) and a traversal like this:

Set<Node> nodeSet = new HashSet<Node>();
for ( Node node : index.getNodes( "profile_text", "neo4j" ) )
{
    nodeSet.add( node );
}
Node user = // get the user;
Traverser trav = user.traverse( Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH, ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.FRIEND, Direction.BOTH );
for ( Node node : trav )
{
    int currentDepth = trav.currentPosition().depth();
    if ( currentDepth > 2 )
    {
        // we only check second level friends
        break;
    }
    if ( nodeSet.contains( node ) )
    {
        addNodeToResult( node, currentDepth );
    }
}

If there are very few hits on the lucene search it is more efficient to do a shortest path search using the neo4j-graph-algo component (0.4-SNAPSHOT):

Node user = // get the user;
// search using lucene and for each node do a shortest path lookup to user
for ( Node node : index.getNodes( "profile_text", "neo4j" ) )
{
    List<Relationship> path = new FindSingleShortestPath( user, node,
        RelTypes.FRIEND, 2 ).getPathAsRelationships();
    int depth = path.size();
    if ( depth > 0 )
    {
        addNodeToResult( node, depth );
    }
}

Regards, -Johan On Thu, Feb 25, 2010 at 8:32 AM, Nishith Shah nish...@truesparrow.com wrote: Hi Satish, Can you assign the keyword that you intend to search as a property of the node? For example, assign 'neo4j' as a property of the node. Of course, it won't be possible if it's a free-form search that you intend to do. But if you can, then traversing and sorting would be so much easier. -nishith On Thu, Feb 25, 2010 at 12:56 PM, k...@oocs.de k...@oocs.de wrote: Hi Satish, if I understand you correctly, you could do the traversal in a breadth-first fashion with ... node.traverse(Order.BREADTH_FIRST, ... You'll get the first degree nodes before the second degree nodes and so forth. Regards, Ulf Satish Varma Dandu dsva...@gmail.com wrote on 24 February 2010 at 19:37: Hi John, Thanks for the reply. Consider a scenario like LinkedIn: 1) I want to search for all profiles in LinkedIn matching "Neo4J". 2) Now I get, let's say, 20 people having Neo4J on their profiles. So far so good. But I want to order these search results my own way: first the results from my direct contacts, followed by the next level of results. The worst case scenario is that once I get these search results, I need to traverse and find the path for each result profile. That takes a lot of time if I get too many search results. So somehow I want to combine Lucene and traversal. Is this doable with Neo4J? Hope I explained the problem. Any help would be great. Thanks, -Satish
Re: [Neo] db has grown after deleting
Hi, Yes, those files hold ids that can be reused. If you start up again and create 310k nodes the id files will shrink and the db files will not change in size. Regards, -Johan 2010/2/25 Miguel Ángel Águila magu...@ac.upc.edu: Hello, I'm doing a delete operation, in this case deleting 310,000 nodes out of 325,000. The delete works and afterwards everything behaves correctly (when I fetch nodes, only the non-deleted ones appear), but the size of the folder that contains the neo4j database has grown. Specifically, the following files have grown: the lucene folder, neostore.nodestore.db.id, neostore.propertystore.db.id, neostore.propertystore.db.strings.id, neostore.relationshipstore.db.id. I hope you can help me. Thank you. Mike ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] IllegalStateException meaning
If a commit fails (after prepare) with an exception and the status of the transaction got set to STATUS_COMMITTED recovery has to be performed (TM will not accept any calls until that transactions has been recovered properly). If the status was not updated to committed the TM will try to rollback the transaction. If rollback succeeds a HeuristicRollbackException will be thrown and the transaction removed. If it fails recovery has to be performed (and again TM will not accept anything until that transaction has been recovered). -Johan On Wed, Feb 24, 2010 at 3:06 PM, Adam Rabung adamrab...@gmail.com wrote: Chalk my complaint up to user error. My code was something like this: try { //import a ton of data Transaction.commit(); finally { Transaction.ensureTransactionClosed(); } commit() was failing w/ an OutOfMemoryError out of the Neo tx finish(), but I never saw this exception: before the exception is logged, the finally block was executed, which called finish() on the same Neo transaction. This second call to finish also throws an exception, saying Tx status is: STATUS_COMMITING, which now makes sense to me. Because the second exception was thrown out of my finally block, I lost the original exception. Nasty! I've changed ensureTransactionClosed() to trap exceptions, which should help. I wonder if a call to Transaction Impl#finish() results in an exception, subsequent calls should be a no-op? Adam On Wed, Feb 24, 2010 at 4:14 AM, Johan Svensson jo...@neotechnology.com wrote: Hi, Yes we are working on monitoring tools. Since transactions are held in memory until committed larger transactions (containing many write operations) will consume more memory. It would be possible to not keep the full transaction in memory but that would kill read performance in that transaction since for every read we would have to check disk if the value being read has a local modification. Adam, do you have the full stacktraces/log messages from the IllegalStateException? Regards, -Johan On Tue, Feb 23, 2010 at 9:49 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote: I think it would be valuable to understand why the memory requirements are so large and how best to manage these types of situations in addition to increasing the heap, since it seems that in some cases this merely delays the issue. Is there any internal instrumentation on Neo memory usage that could be used to help tune/tweak the settings? If not, would it make sense to add a couple of MBeans for this type of information? Rick -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Adam Rabung Sent: Tuesday, February 23, 2010 2:15 PM To: Neo user discussions Subject: Re: [Neo] IllegalStateException meaning I just got this same problem, and was able to defeat by upping heap size. It was very strange - does Transaction#finish do some non-blocking work? Disclaimer: I'm using trunk neo-kernel from 2/10. Thanks, Adam ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
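The lesson from Adam's report generalises: if the cleanup helper in the finally block can itself throw, it masks the exception that made the commit fail in the first place. A defensive sketch of the import loop along the lines of his fix:

Transaction tx = graphDb.beginTx();
try
{
    // import a ton of data ...
    tx.success();
}
finally
{
    try
    {
        tx.finish();   // the actual commit happens here and may fail (e.g. OutOfMemoryError)
    }
    catch ( Throwable commitFailure )
    {
        // log instead of rethrowing so an exception from the try block is not swallowed;
        // in real code, rethrow only if the try block completed normally
        commitFailure.printStackTrace();
    }
}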
Re: [Neo] IllegalStateException meaning
Hi, Yes we are working on monitoring tools. Since transactions are held in memory until committed larger transactions (containing many write operations) will consume more memory. It would be possible to not keep the full transaction in memory but that would kill read performance in that transaction since for every read we would have to check disk if the value being read has a local modification. Adam, do you have the full stacktraces/log messages from the IllegalStateException? Regards, -Johan On Tue, Feb 23, 2010 at 9:49 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote: I think it would be valuable to understand why the memory requirements are so large and how best to manage these types of situations in addition to increasing the heap, since it seems that in some cases this merely delays the issue. Is there any internal instrumentation on Neo memory usage that could be used to help tune/tweak the settings? If not, would it make sense to add a couple of MBeans for this type of information? Rick -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Adam Rabung Sent: Tuesday, February 23, 2010 2:15 PM To: Neo user discussions Subject: Re: [Neo] IllegalStateException meaning I just got this same problem, and was able to defeat by upping heap size. It was very strange - does Transaction#finish do some non-blocking work? Disclaimer: I'm using trunk neo-kernel from 2/10. Thanks, Adam ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Neo4j Traverse API
Hi, You can not make such an ordered search using the Lucene indexing service. You could try to only use a traverser instead of a Lucene search and let the traverser do the filtering. I am not sure I understand your problem completely. If you could describe the problem in more detail I am sure we can come up with a good solution for it. Regards, -Johan On Tue, Feb 23, 2010 at 11:00 PM, Satish Varma Dandu dsva...@gmail.com wrote: Hi, I am new to Neo4J, and so far it looks really good for traversing nodes. I have a question on using Traverser API. Can we order the lucene search results by degree wise. When i search for some data using lucene, i will get some nodes. Now i want to arrange those search results nodes in the order of level.i.e first i want results from my direct nodes then next level of nodes etc. Is this supported out of the box? Appreciate if you could point me to correct resource. Thanks Regards, -Satish ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Obtaining cache statistics
Hi, I think the easiest way would be for you to instrument the code in org.neo4j.kernel.impl.core.NodeManager (check for nodeCache.get and relCache.get calls). Another way would be to let the application run the same traversal/query many times in a row with different heap sizes. If later iterations of the same traversal are not converging towards the same time (faster than a cold run) there may be a problem with the JVM settings and/or kernel configuration. Regards, -Johan On Thu, Feb 18, 2010 at 5:37 PM, Georg M. Sorst georgso...@gmx.de wrote: Hey list, what would be the best way to obtain cache statistics from Neo4j, stuff like hit / miss ratio for the node / relationship cache etc. I guess I could always try profiling and check for the relevant methods but is there a better way? More generally speaking, if your application is slow and you suspect the cache might have something to do with it how would you proceed? While there is some info about configuring the cache and some rough guidelines on the recommended sizes I could not find a way to get hard numbers for tuning. Thanks and best regards, Georg ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
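Johan's second suggestion, repeating the same query and watching the timings converge, needs nothing more than a loop around the traversal under test; runQueryUnderTest below is a placeholder for your own traversal, not an existing API.

for ( int run = 0; run < 10; run++ )
{
    long start = System.currentTimeMillis();
    int visited = runQueryUnderTest( graphDb );   // placeholder: your own traversal/query
    System.out.println( "run " + run + ": " + visited + " nodes in "
        + ( System.currentTimeMillis() - start ) + " ms" );
}
// if later runs do not converge to a stable, much faster time, the cache/heap is likely too small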
Re: [Neo] filtering content based on tags in multiple levels
Hi, Another way would be to use the graph matching component. Have a look at this thread: http://lists.neo4j.org/pipermail/user/2010-February/002722.html Regards, -Johan On Thu, Feb 18, 2010 at 8:49 AM, Raul Raja Martinez raulr...@gmail.com wrote: Hi Sumanth, You can have all your questions and answers be nodes that are connected through relationships that are your tags. For any kind of filtering you can have Traversers with returnable evaluators that evaluate the kind of results you want back from you graph structure. So the basic answer is yes you can do that with neo4j and I believe in a easier, faster and more natural way that you would otherwise do in a relational database. 2010/2/17 Sumanth Thikka suma...@truesparrow.com Hi, Consider the following scenario: We have some questions and some tag(s) associated with each question. We have tags(lets say A, B, C, D, E etc) associated with some questions(just like in stackoverflow http://stackoverflow.com/). We should be able to filter all questions we have based on a tag(lets say it is A). Once we have all questions having tag A, other tags associated with this set of questions needs to be listed. Selecting(filtering) a tag(say B) from this second level tags should filter questions having the tags A and B. The filtered questions have tags A, B and some other possibly. The same above scenario should be served to any level of tags. Is it possible to achieve by using neo4j? If yes, how can we achieve this? ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Neo4j failing transactions after a day of tiny load
Are you using the old cache implementation or the new one based on soft references? If you have configured Neo4j kernel with use_old_cache=true you are using the old one. I think you would get best performance using the default soft reference based LRU cache just passing in -server -XX:+UseConcMarkSweepGC together with heap size to the JVM. To clear the caches invoke the following on the GraphDatabaseService: ((EmbeddedGraphDatabase) graphDb).getConfig().getNeoModule(). getNodeManager().clearCache(); -Johan On Wed, Feb 17, 2010 at 12:15 PM, Dmitri Livotov dmi...@livotov.eu wrote: This one it worked up to 2.2 days till the same crash. The heap setting in neo4j was set to 0.84 and soon after the tests start, free memory lowered to 100M (relates exactly to 0.84 value)and was sitting at that level entire time, sometimes falling to 50-70 M and raising up to 600-700M. So I suppose, the crash did happen during one of such heap fallings, where some other subsystem requested a bit more heap. Only strange, it did not write OOM exception to the logs but this could be because the OOM happened inside app server transaction manager, for instance and was not reported to logs. We'll run it again, setting the heap value to 0.4 - 0,5 of entire heap available. But need to say, that during the testing, even in low memory conditions, response time from test calls to the database was qute fast. Could you please describe more detail on non public API for cache reset, as this still could be useful in some situations, so we'd like to have such reset cache button later in our admin interace. It would be nice to read somewhere a short summary on all undocumented/hidden API calls, as we'd like to create UI and ability to fine tune for as much db corners as possible from our system. (Being so tired on inability to control SQL databases well) Best, Dmitri Johan Svensson wrote: On Mon, Feb 15, 2010 at 7:20 PM, Dmitri Livotov dmi...@livotov.eu wrote: By the way, is there any global L2 cache , based on heap in neo4j and can it be emptied manually ? We run another test with increased heap of 2G and 1G for memory mapped files. After 4 hours of testing, JVM reports from 100 to 200 M of available memory, so past crashes probably was caused by the heap issues. However, neo4j is still running, showing not so bad performace, so we'll keep it running till tomorrow. Did the tests run ok this time? There is a LRU cache for nodes and relationships. See http://wiki.neo4j.org/content/Configuration_Settings#Cache_settings and http://wiki.neo4j.org/content/Neo4j_Performance_Guide for more information. It is possible to clear the cache using non standard API call on the GraphDatabaseService but I would not recommend doing that since it will cause all requests to start hitting the file system (and start building the LRU cache from scratch). Regards, -Johan ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Popularity sorting
Chris, Sorry for long response time. The pattern matcher can only match a single exact depth/pattern at the moment. There is (currently) no way to configure it to match the way you describe. Instead you would have to generate a pattern for each depth meaning: product-customer-product product-customer-product-customer-product product-customer-product-customer-product-customer-product and so on. -Johan On Wed, Feb 10, 2010 at 5:10 PM, Chris Owen co...@thoughtworks.com wrote: Thanks for all the response, Johan the pattern matcher looks interesting. Following your example I see the results expected for matching the pattern product-customer-product from the selected node. This is what we want, though I wonder how we could repeat this match until a given depth is found i.e. product-customer-product-customer-product At first I thought that this would be the case and it was only the PropertyEqualConstraint on the propertyX pattern that was ensuring the the first node in the pattern had to be named a given value. This didn't seem to be the case though, as removing the constraint did nothing to the result set. Hope this makes sense, I'm still getting my head around all this. Chris ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
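Generating one pattern per depth can be mechanised. Here is a sketch that uses only the PatternNode and PatternMatcher calls from Johan's earlier example in this thread; the PURCHASED type and the three-hop limit are illustrative, and this is not a built-in variable-length matching feature of the component.

// build the chain (product)-(customer)-(product)-...-(product) with the given number of product hops
static PatternNode[] buildChain( int productHops, RelationshipType purchased )
{
    PatternNode first = new PatternNode();
    PatternNode previous = first;
    for ( int i = 0; i < productHops; i++ )
    {
        PatternNode customer = new PatternNode();
        PatternNode next = new PatternNode();
        customer.createRelationshipTo( previous, purchased );
        customer.createRelationshipTo( next, purchased );
        previous = next;
    }
    return new PatternNode[] { first, previous };   // start of the chain and the product at the far end
}

// match each depth separately and merge the results
for ( int hops = 1; hops <= 3; hops++ )
{
    PatternNode[] chain = buildChain( hops, PURCHASED );
    for ( PatternMatch match : PatternMatcher.getMatcher().match( chain[0], productXNode ) )
    {
        Node endProduct = match.getNodeFor( chain[1] );
        // tally endProduct occurrences to rank popularity at this depth
    }
}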
Re: [Neo] Neo4j failing transactions after a day of tiny load
Ah now I understand. All the cache settings you are using in that configuration are for the old cache and will have no effect on the default soft reference based one. To tweak the soft reference based one you need to tweak the JVM, here is an introduction: http://jeremymanson.blogspot.com/2009/07/how-hotspot-decides-to-clear_07.html We need to make that clearer in the documentation that those cache settings only apply if use_old_cache=true is set. -Johan On Wed, Feb 17, 2010 at 12:37 PM, Dmitri Livotov dmi...@livotov.eu wrote: Yes, we do have both options enabled: -server -XX:+UseConcMarkSweepGC as well as no old_cache parameter. Im attaching our config below: neostore.nodestore.db.mapped_memory=270M neostore.relationshipstore.db.mapped_memory=385M neostore.propertystore.db.mapped_memory=110M neostore.propertystore.db.index.mapped_memory=2M neostore.propertystore.db.index.keys.mapped_memory=2M neostore.propertystore.db.strings.mapped_memory=130M neostore.propertystore.db.arrays.mapped_memory=100M use_adaptive_cache=YES adaptive_cache_heap_ratio=0.85 adaptive_cache_manager_decrease_ratio=1.15 adaptive_cache_manager_increase_ratio=1.1 adaptive_cache_worker_sleep_time=3000 min_node_cache_size=0 cache is in use relationship cache will not be decreased under this value min_relationship_cache_size=0 max_node_cache_size=1500 max_relationship_cache_size=4500 ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Neo4j failing transactions after a day of tiny load
Correct. On Wed, Feb 17, 2010 at 12:54 PM, Dmitri Livotov dmi...@livotov.eu wrote: Aha, thanks for the clarification, so is this correct, that for neo4j.properties we only need to consider configuring only the neostore.* properties ? ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo] Announcing Neo4j 1.0
Friends, After ten years of development, we are happy to finally announce the release of Neo4j 1.0. It's available here: http://neo4j.org/download http://dist.neo4j.org/neo4j-kernel-1.0-binary.zip http://dist.neo4j.org/neo4j-kernel-1.0-binary.tar.gz For the kernel component this release includes documentation updates together with bug fixes for all known bugs. For more information see: http://dist.neo4j.org/CHANGES.txt Also included in this release is the Neo4j index component: http://components.neo4j.org/neo4j-index/ (version 1.0) You can download the kernel and index (together with some other useful components) bundled together in the apoc package: http://dist.neo4j.org/neo4j-apoc-1.0.zip http://dist.neo4j.org/neo4j-apoc-1.0.tar.gz If you are using maven you can depend on the following (group id=org.neo4j): neo4j-apoc 1.0 or individual components: neo4j-kernel 1.0 neo4j-index 1.0 Finally, let us just offer a huge thanks to everyone on this list, on twitter and in the broader community. Without the feedback and energy and passion and interest from all of you guys, all the endless nights of staring through java.nio stacktraces would never be worth it. We truly feel that 2010 is the year of the graph. Let's change the world. -- Regards, The Neo4j team ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Neo4j failing transactions after a day of tiny load
Hi, Looks like a commit fails and then the TM tries to rollback the transaction but that also fails. Only thing TM can do then is to block all other running and new transactions from executing until the failed transaction has been resolved. The strange thing is that the original exception that caused the commit to fail is not logged (only the exception thrown on the following rollback call is logged). A commit fail could be caused by an OutOfMemoryError or no more disk space. If an OutOfMemoryError is thrown it could explain why some log messages are missing. Could you try re-running this using the neo4j-kernel 1.0-SNAPSHOT while monitoring heap usage and available disk space? Regards, -Johan On Mon, Feb 15, 2010 at 10:41 AM, Dmitri Livotov dmi...@livotov.eu wrote: Morning ! Past weekend we established a tiny load test with approx 20 threads in total from a single jmeter machine in order to see how the database will work for a long term under a constant load. The test requests were simple: - (r/w) - random node read by primary key, modification of 10 properties and commit - (r/o) - random node read by primary key, traverse and iterate traversal results We run jmeter on Friday evening (19:00) and database failed at Satturday, about 14:00. After restarting the app server around 16:00 we run the tests again and database failed on Sunday, about 19:00. The diagnostics are strange - suddenly it fails to begin a new transaction and says Unable to start transaction. No more extra messages and stacktraces but this one. Today we crawled our server logs once again and here how it fails in more details: Suddenly, org.neo4j.kernel.impl.transaction.TransactionFailureException: Unable to commit transaction. Caused by: javax.transaction.HeuristicMixedException: Unable to rollback --- error code in commit: -1 --- error code for rollback: 0 error appears. Then, all subsequent requests fails with Unable to start transaction. Only a JVM restart solves the problem - if we just redeploy the webapp, neo4j will not start, yelling on impossibility to obtain a lock to database files - the same message if you try to run two neo4j instances with a same database folder. So it looks like some thread keeps sitting and running in memory, locking the data files. Im including below the beginning of server.log from the moment of time when first failure appears. Not sure, if this internal neo4j problem or somethng from JTA, so would appreciate your commetns/suggestions. Hope we'll be able to figure this out. Dmitri P.S. To clarify neo4j instance usage in a webapp - neo4j instance in initialized in within a singleton class. This class is first touched from servlet context listener, when web app starts up, so database gets initialized at that phase. The servlets only using the singleton class to get that neo4j instance. ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Question about Neo mistakes wiki page: What's wrong with the synchronization example?
Hi, On Wed, Feb 10, 2010 at 10:48 AM, Thomas Andersson greddbul...@gmail.com wrote: Hi, ... // don't do this synchronized void methodA() { nodeA.setProperty( prop1, 1 ); methodB(); } synchronized void methodB() { nodeB.setProperty( prop2, 2 ); } According to the test, The code above is very deadlock prone when methods A and B are called concurrently.. Exactly why is that? I try to understand what the situation is that should be avoided, but when I run the code in my head I can't figure out when and why the deadlock occurs. All Neo4j API operations must be invoked within a transaction and the modifying operations (such as Node.setProperty) will synchronize access between concurrent transactions (on a node and relationship level). I updated the wiki with a link to http://wiki.neo4j.org/content/Transactions#Isolation that explains this a bit better. To give you a deadlock scenario with the code above: - tx1 modifies nodeB - tx2 calls methodB but blocks on nodeB.setProperty - tx1 calls methodA and will then block when trying to call methodB - deadlock that can not be detected by the Neo4j kernel Regards, -Johan ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
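A deadlock-free rewrite of the wiki example, as a sketch: drop the Java-level synchronization and let a single transaction take the per-entity write locks in a consistent order. This assumes every caller touches nodeA before nodeB; it is an illustration of the idea, not the wiki's own suggested fix.

void updateBoth()
{
    Transaction tx = graphDb.beginTx();
    try
    {
        nodeA.setProperty( "prop1", 1 );   // write lock on nodeA, held until the tx finishes
        nodeB.setProperty( "prop2", 2 );   // write lock on nodeB, taken second by every caller
        tx.success();
    }
    finally
    {
        tx.finish();                       // both locks released here
    }
}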
Re: [Neo] Popularity sorting
Hi, Since we are throwing out possible solutions on how to do this I want to mention the graph-matching component. The component needs an API brushup and some documentation on how to use it (currently non-existing...) but I think it should solve your problem nicely. Here is a code example:

// build the pattern that is to be matched in the real graph
PatternNode productX = new PatternNode();
PatternNode productY = new PatternNode();
PatternNode customer = new PatternNode();
productX.addPropertyEqualConstraint( "name", "productX" );
customer.createRelationshipTo( productX, PURCHASED );
customer.createRelationshipTo( productY, PURCHASED );

// get the real product X node
Node productNode = ... // get productX node somehow, e.g. from index

// start match on product X node
Iterable<PatternMatch> matches = PatternMatcher.getMatcher().match( productX, productNode );

// for each sub-graph that matches, use that result to calculate popularity
// of other products (Y)
for ( PatternMatch match : matches )
{
    Node customerNode = match.getNodeFor( customer );
    Node otherProductNode = match.getNodeFor( productY );
    // do the popularity calculation using otherProductNode...
}

So first create the pattern that should be matched, then decide where to start the match in the real graph and finally extract all the results and do the calculation. Adding more constraints just means modifying the pattern to match. For example, if we are only interested in calculating the popularity of other products for customers that are members of a specific customer group, just add:

PatternNode customerGroup = new PatternNode();
customerGroup.addPropertyEqualConstraint( "name", "rich_people" );
customer.createRelationshipTo( customerGroup );

Or, maybe Bill Gates is one of our rich customers and we only want productY's that he has bought:

PatternNode bill = new PatternNode();
bill.addPropertyEqualConstraint( "name", "Bill Gates" );
bill.createRelationshipTo( productY, PURCHASED );

We could now change where to start the match (both productX and bill will only map to a single node) and depending on how the graph looks it may be more efficient to perform the match from the node representing Bill Gates:

// start match from Bill Gates
Node billGatesNode = ... // get him somehow
Iterable<PatternMatch> matches = PatternMatcher.getMatcher().match( bill, billGatesNode );

Hopefully this is enough to get started with the graph-matching component. I personally think it has a lot of potential! Regards, -Johan On Wed, Feb 10, 2010 at 1:32 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: ... On Wed, Feb 10, 2010 at 1:27 PM, Craig Taverner cr...@amanzi.com wrote: ... On Wed, Feb 10, 2010 at 1:10 PM, rick.bullo...@burningskysoftware.com wrote: ... Original Message Subject: [Neo] Popularity sorting From: Chris Owen co...@thoughtworks.com Date: Wed, February 10, 2010 3:05 am To: user@lists.neo4j.org Hi, One of the questions that we were asked to try and answer was: *based on Product X, find other Products Customers bought who also bought Product X.* This is quite simple to traverse, but we are now trying to answer a very similar question: *based on Product X, find other Products Customers have bought who also bought Product X, and order by Popularity.* We have not managed to find a way, without altering the internal traversers of Neo4J, to see how many times a node is found, as by default it ignores duplicates. Any ideas of how we solve this problem? Cheers, Chris. 
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Node with the millions of incoming relationships to - is this a proper way ?
On Wed, Feb 10, 2010 at 2:35 PM, Dmitri Livotov dmi...@livotov.eu wrote: Thanks for all your detailed responses. We now moved forward for stress testing it initial results shows a quite well performace, we're now running random reads, traversing and updates from 100 parallel threads and average response on neo4j with default (read no) memory settings on a developer machine is about 300-400 ms. We're going to adjust memory options and polish test cases and run this on a production server under a typical production load of our current, SQL-bases system, which handles about 500...1000 concurrent users daily. If anyone interested, I'll publish the results. Any suggestions and hints for fine-tuning neo4j for high load as well for another test cases to perform would be also very appreciated. I would suggest the following: o Proper memory mapped configuration (if traversal heavy try get full on node and relationship store files) o Heap should be sized in such a way so peak loads do not result in all CPU time being spent in GC o JVM started with -server o Make sure parallel/concurrent garbage collector is running (we found that -XX:+UseConcMarkSweepGC works well in most use-cases) For more information see: http://wiki.neo4j.org/content/Configuration_Settings Regards, -Johan ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Dijkstra search on a big graph
Hi, On Fri, Feb 5, 2010 at 6:36 PM, Anton Popov popov.ua.w...@gmail.com wrote: Hello all, I'm still doing my tests on Neo4J. I've imported some data to Neo4J database trying to search a shortest path search using Dijkstra implementation from neo4j-graph-algo package. As a result - I get exceptions hanged application - can anyone help me to solve the problem? I've packed the bundle, containing Java code, system output (with Exceptions), neo4j.properties and results of top command. It's available here: https://docs.google.com/leaf?id=0Bx7diqqg3SSaYzMzZjU1ZGUtNjhjOS00MDg1LWFhYzItOWY0YjU5MTRlODdmhl=en Please forgive me many System.out prints - that's just an test code to be introduced to the system, it's features performance. After the last Exception, written to the output application seems to hang. Used memory is 1848m during that moment. My environment is: - Ubuntu, 4GB of RAM. - Before test starts, top shows that I have 3.4GB of RAM non-used. So almost no other applications run. - I'm using following Neo4J versions: ... The Exceptions actually are: ... Caused by: java.io.IOException: Invalid argument at sun.nio.ch.FileChannelImpl.truncate0(Native Method) at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:728) at org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow.init(MappedPersistenceWindow.java:53) ... 15 more How large heap have you set? I see that you have allocated 3G for memory mapping that leaves about 512M max left for heap size since OS (and other on OS processes) needs some memory. I have not seen this stacktrace on Linux (used to happen a lot on 32-bit windows versions since the 2G continuous map limit really wasn't 2G). Try lower nodestore memory mapped settings (check how large file is, should not be 1GByte?) and see if that helps. Regards, -Johan ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user