[Neo4j] Accessing node properties with batch inserter
Just started using the batch inserter and I think I am missing a basic concept. This code snippet using Neo4j 1.5 returns a zero-length map. I would expect it to have a single property (MyKey).

    String storeDir = "./neodb";
    deleteDirectory(new File(storeDir));
    BatchInserter batchInserter = new BatchInserterImpl(storeDir);
    GraphDatabaseService graph = batchInserter.getGraphDbService();
    Transaction transaction = graph.beginTx();
    Node node = graph.createNode();
    long id = node.getId();
    node.setProperty("MyKey", "MyValue");
    transaction.success();
    transaction.finish();
    Map<String,Object> properties = batchInserter.getNodeProperties(id); // properties is empty

I added the Transaction stuff for testing, but was expecting that to not be necessary as well.

Thanks,

Paul Jackson, Principal Software Engineer
Pitney Bowes Software
4200 Parliament Place | Suite 600 | Lanham, MD 20706-1844 USA
O: 301.918.0850 | M: 703.862.0120 | www.pb.com
paul.jack...@pb.com

Every connection is a new opportunity(tm)

Please consider the environment before printing or forwarding this email. If you do print this email, please recycle the paper.

This email message may contain confidential, proprietary and/or privileged information. It is intended only for the use of the intended recipient(s). If you have received it in error, please immediately advise the sender by reply email and then delete this email message. Any disclosure, copying, distribution or use of the information contained in this email message to or by anyone other than the intended recipient is strictly prohibited. Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of the Company.
___
NOTICE: THIS MAILING LIST IS BEING SWITCHED TO GOOGLE GROUPS, please register and consider posting at https://groups.google.com/forum/#!forum/neo4j
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
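A possible explanation for the empty map, offered as an assumption and not confirmed anywhere in this thread: the snippet above writes through the GraphDatabaseService facade but reads through the BatchInserter directly. If the facade buffers property writes in memory and only hands them to the inserter later (for example at shutdown), a direct read in between would see nothing. The toy model below uses only the JDK; ToyInserter and BufferingFacade are hypothetical names and are not Neo4j classes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a low-level inserter: reads and writes hit the store directly.
final class ToyInserter {
    private final Map<Long, Map<String, Object>> store = new HashMap<Long, Map<String, Object>>();
    private long nextId = 0;

    long createNode(Map<String, Object> props) {
        long id = nextId++;
        store.put(id, new HashMap<String, Object>(props));
        return id;
    }

    void setNodeProperties(long id, Map<String, Object> props) {
        store.put(id, new HashMap<String, Object>(props));
    }

    Map<String, Object> getNodeProperties(long id) {
        return new HashMap<String, Object>(store.get(id));
    }
}

// Hypothetical facade that buffers property writes until flush() is called.
final class BufferingFacade {
    private final ToyInserter inserter;
    private final Map<Long, Map<String, Object>> pending = new HashMap<Long, Map<String, Object>>();

    BufferingFacade(ToyInserter inserter) {
        this.inserter = inserter;
    }

    long createNode() {
        long id = inserter.createNode(new HashMap<String, Object>());
        pending.put(id, new HashMap<String, Object>());
        return id;
    }

    void setProperty(long id, String key, Object value) {
        // Buffered only: the underlying store does not see this yet.
        pending.computeIfAbsent(id, k -> new HashMap<String, Object>()).put(key, value);
    }

    void flush() {
        // Push every buffered property map down to the inserter.
        for (Map.Entry<Long, Map<String, Object>> e : pending.entrySet()) {
            inserter.setNodeProperties(e.getKey(), e.getValue());
        }
        pending.clear();
    }
}
```

In this model a read through ToyInserter returns an empty map until flush() runs. If the real facade behaves similarly, passing the properties straight to the inserter (BatchInserter.createNode takes a property map) may avoid the surprise; I have not verified that against 1.5.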
[Neo4j] Database left in locked state when an exception is thrown during upgrade
This is not a huge deal, since the real problem is that the database cannot be updated, but I thought I would share in case this can occur in other scenarios.

In my case, I have a database that I created with build 1.5m02 that I am opening with release 1.5. According to the error message, I must not have shut down the database properly prior to upgrade. If I attempt this a second time in the same process I get a different exception that implies the database is still locked. It seems to me like this operation should have been attempted inside a try block with a finally block that performs an unlock.

Here are the stack traces:

First attempt:

Caused by: org.neo4j.graphdb.TransactionFailureException: Could not create data source [nioneodb], see nested exception for cause of error
    at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:158)
    at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:105)
    at org.neo4j.kernel.EmbeddedGraphDbImpl.init(EmbeddedGraphDbImpl.java:190)
    at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:80)
    at com.g1.dcg.graph.neo4j.NeoGraph.init(NeoGraph.java:128)
    ... 43 more
Caused by: java.lang.IllegalStateException: Mismatching store version found (Uknown while expecting v0.A.0) and the store is not cleanly shutdown. Recover the database with the previous database version and then attempt to upgrade
    at org.neo4j.kernel.impl.nioneo.store.NeoStore.checkVersion(NeoStore.java:125)
    at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.init(CommonAbstractStore.java:104)
    at org.neo4j.kernel.impl.nioneo.store.AbstractStore.init(AbstractStore.java:120)
    at org.neo4j.kernel.impl.nioneo.store.NeoStore.init(NeoStore.java:78)
    at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.init(NeoStoreXaDataSource.java:165)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.neo4j.kernel.impl.transaction.XaDataSourceManager.create(XaDataSourceManager.java:77)
    at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:152)
    ... 47 more

Second attempt in same process:

Caused by: org.neo4j.graphdb.TransactionFailureException: Could not create data source [nioneodb], see nested exception for cause of error
    at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:158)
    at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:105)
    at org.neo4j.kernel.EmbeddedGraphDbImpl.init(EmbeddedGraphDbImpl.java:190)
    at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:80)
    at com.g1.dcg.graph.neo4j.NeoGraph.init(NeoGraph.java:128)
    ... 43 more
Caused by: java.lang.IllegalStateException: Unable to lock store [E:\Spectrum\server\modules\graph\db\graph\neostore], this is usually a result of some other Neo4j kernel running using the same store.
    at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.checkStorage(CommonAbstractStore.java:175)
    at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.init(CommonAbstractStore.java:103)
    at org.neo4j.kernel.impl.nioneo.store.AbstractStore.init(AbstractStore.java:120)
    at org.neo4j.kernel.impl.nioneo.store.NeoStore.init(NeoStore.java:78)
    at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.init(NeoStoreXaDataSource.java:165)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.neo4j.kernel.impl.transaction.XaDataSourceManager.create(XaDataSourceManager.java:77)
    at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:152)
    ... 47 more

Paul Jackson
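The try/finally suggestion in the message above can be sketched with plain JDK file locking. This is not the Neo4j kernel code; GuardedOpen, Check, and the simulated version check are hypothetical names used only for illustration. The idea is that the lock is released whenever any step after acquiring it fails, so a second attempt in the same process is not blocked by a leftover lock.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

final class GuardedOpen {
    // Hypothetical hook standing in for whatever validation runs after locking.
    interface Check {
        void run() throws IOException;
    }

    // Locks the file, runs the check, and releases the lock if the check fails.
    // On success the caller owns the returned lock and must close its channel.
    static FileLock openWithLock(File file, Check versionCheck) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        FileChannel channel = raf.getChannel();
        FileLock lock = null;
        boolean success = false;
        try {
            lock = channel.tryLock();
            if (lock == null) {
                throw new IllegalStateException("Unable to lock store [" + file + "]");
            }
            versionCheck.run();
            success = true;
            return lock;
        } finally {
            if (!success) {
                if (lock != null) {
                    lock.release();
                }
                raf.close(); // also closes the channel
            }
        }
    }

    // Simulates a failed first open followed by a second open in the same process.
    static boolean secondOpenSucceedsAfterFailedFirst() {
        try {
            File store = File.createTempFile("toystore", ".db");
            store.deleteOnExit();
            try {
                openWithLock(store, () -> { throw new IOException("simulated version mismatch"); });
                return false; // the simulated check should have thrown
            } catch (IOException expected) {
                // The finally block has already released the lock.
            }
            FileLock lock = openWithLock(store, () -> { });
            boolean held = lock.isValid();
            lock.channel().close(); // releases the lock
            return held;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without the finally block, the first failure would leave the lock held until the JVM exits, which matches the symptom reported for the second attempt.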
[Neo4j] Persisting community information
Suppose I have a graph and I run a community detection algorithm on it. These algorithms usually return a dendrogram, representing the division of the graph from whole network to individual nodes. Does anyone have experience persisting these results? I suppose it could be stored as a separate graph, but is there a way to store it within the graph itself?

Paul Jackson
[Neo4j] Exception when converting older graph
I have a graph that was created with 1.4.M05 that I am trying to open with 1.5.M02. Is this supported? I get this exception:

Caused by: org.neo4j.graphdb.TransactionFailureException: Could not create data source [nioneodb], see nested exception for cause of error
    at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:153)
    at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:112)
    at org.neo4j.kernel.EmbeddedGraphDbImpl.init(EmbeddedGraphDbImpl.java:190)
    at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:80)
    at com.g1.dcg.graph.neo4j.NeoGraph.init(NeoGraph.java:124)
    ... 42 more
Caused by: java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:249)
    at org.neo4j.kernel.impl.nioneo.xa.Command.readDynamicRecord(Command.java:253)
    at org.neo4j.kernel.impl.nioneo.xa.Command$RelationshipTypeCommand.readCommand(Command.java:957)
    at org.neo4j.kernel.impl.nioneo.xa.Command.readCommand(Command.java:1004)
    at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource$CommandFactory.readCommand(NeoStoreXaDataSource.java:302)
    at org.neo4j.kernel.impl.transaction.xaframework.LogIoUtils.readTxCommandEntry(LogIoUtils.java:157)
    at org.neo4j.kernel.impl.transaction.xaframework.LogIoUtils.readLogEntry(LogIoUtils.java:99)
    at org.neo4j.kernel.impl.transaction.xaframework.LogIoUtils.readEntry(LogIoUtils.java:76)
    at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.readEntry(XaLogicalLog.java:866)
    at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.doInternalRecovery(XaLogicalLog.java:796)
    at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.open(XaLogicalLog.java:238)
    at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.open(XaLogicalLog.java:192)
    at org.neo4j.kernel.impl.transaction.xaframework.XaContainer.openLogicalLog(XaContainer.java:97)
    at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.init(NeoStoreXaDataSource.java:147)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.neo4j.kernel.impl.transaction.XaDataSourceManager.create(XaDataSourceManager.java:75)
    at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:147)
    ... 46 more

The values in the readDynamicRecord method at the time of the call are:

    static = org.neo4j.kernel.impl.nioneo.xa.Command
    byteChannel = {org.neo4j.kernel.impl.util.BufferedFileChannel@53535}
    buffer = {java.nio.DirectByteBuffer@33560}java.nio.DirectByteBuffer[pos=12 lim=12 cap=713]
    id = 1
    type = 0
    inUseFlag = 1
    inUse = true
    record = {org.neo4j.kernel.impl.nioneo.store.DynamicRecord@63952}DynamicRecord[1,true,isLight,-1]
    nrOfBytes = -1
    nextBlock = -4294967280

Thanks.

Paul Jackson
[Neo4j] Multiple threads sharing a transaction
I encounter an error when attempting to support multiple threads. I have a graph manager singleton that controls access to a graph and handles the batching of transactions (committing them only after a set number of operations). Multiple threads can perform read and write operations concurrently, but the graph manager protects concurrent access through the use of read/write locks.

What I am finding is that while a thread can see a node that was created by another thread before the transaction completes, it is not able to see the properties that were written to it.

The steps that are followed are:

Thread 1:
  1. Create a node
  2. Write a property (called _stp_type) to the node
  3. Save the node's ID to a map of IDs by _stp_id (yet another node property)
  (the transaction is NOT committed)

Thread 2 (using the same transaction instance, possibly created by another thread):
  1. Look in the map for the ID of a node with a given _stp_id
  2. Get the node from the graph by ID (graph.getNodeById(id))
  3. Get the _stp_type of that node (node.getProperty("_stp_type"))

Caused by: org.neo4j.graphdb.NotFoundException: _stp_type property not found for NodeImpl#4.
    at org.neo4j.kernel.impl.core.Primitive.newPropertyNotFoundException(Primitive.java:172)
    at org.neo4j.kernel.impl.core.Primitive.getProperty(Primitive.java:167)
    at org.neo4j.kernel.impl.core.NodeProxy.getProperty(NodeProxy.java:145)
    at com.g1.dcg.graph.neo4j.NeoNode.init(NeoNode.java:26)
    at com.g1.dcg.graph.neo4j.NeoGraph.toDcgNode(NeoGraph.java:818)
    at com.g1.dcg.graph.neo4j.NeoGraph.getNode(NeoGraph.java:1029)
    at com.g1.dcg.graph.neo4j.NeoGraphAutoTx.getNode(NeoGraphAutoTx.java:198)
    at com.g1.component.graph.WriteToGraphStage.execute(WriteToGraphStage.java:93)
    ... 7 more

I am perfectly willing to accept that what I am doing should not work; I'm just a little thrown by the fact that I can see the node but not the properties. Wondering if it is a bug or an invalid use case.

Also, if the proper approach is for each thread to create its own transaction, then am I correct to assume that uncommitted changes in one thread will not be visible to the other? Is it then the case that the only way for two threads to be consistent is to ensure that one thread commits before any other starts its own transaction?

Thanks.

Paul Jackson
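The closing question about visibility between per-thread transactions can be illustrated with a toy model of read-committed behaviour, which is the default isolation level the Neo4j manual describes. This is only a sketch using the JDK: ToyStore and Tx are hypothetical names, and the model ignores locking, rollback, and everything else a real kernel does. Each transaction keeps its own pending writes, and other transactions see them only after commit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ToyStore {
    // State visible to every transaction.
    private final Map<String, String> committed = new ConcurrentHashMap<String, String>();

    // One instance per thread in this sketch; pending writes are private to it.
    final class Tx {
        private final Map<String, String> pending = new HashMap<String, String>();

        void put(String key, String value) {
            pending.put(key, value);
        }

        String get(String key) {
            // A transaction sees its own writes first, then committed state.
            String own = pending.get(key);
            return own != null ? own : committed.get(key);
        }

        void commit() {
            committed.putAll(pending);
            pending.clear();
        }
    }

    Tx beginTx() {
        return new Tx();
    }

    // Writer thread sets a property; reader thread then looks it up in its own Tx.
    static String readFromSecondThread(boolean commitFirst) {
        ToyStore store = new ToyStore();
        String[] seen = new String[1];
        Thread writer = new Thread(() -> {
            ToyStore.Tx tx = store.beginTx();
            tx.put("_stp_type", "person");
            if (commitFirst) {
                tx.commit();
            }
        });
        Thread reader = new Thread(() -> seen[0] = store.beginTx().get("_stp_type"));
        try {
            writer.start();
            writer.join();
            reader.start();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

In this model the reader gets null when the writer has not committed and the written value when it has, which is the behaviour the question anticipates.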
Re: [Neo4j] Unable to upgrade neostore
I did not. If this is what is required then you have answered my question.

Thanks.
-Paul

-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Adriano Henrique de Almeida
Sent: Tuesday, July 05, 2011 10:59 PM
To: Neo4j user discussions
Subject: Re: [Neo4j] Unable to upgrade neostore

Paul,

Did you try to upgrade to 1.2, then to 1.3 and then to 1.4 before going from the 1.1 straight to the 1.4?

Regards

--
Adriano Almeida
Caelum | Ensino e Inovação
www.caelum.com.br
[Neo4j] Unable to upgrade neostore
I have a Neo4j 1.1 graph that I tried opening with 1.4M5. I had a configuration that contained allow_store_upgrade=true:

    [15] = {java.util.HashMap$Entry@12374} allow_store_upgrade - true
      key: java.lang.String = {java.lang.String@12376}allow_store_upgrade
      value: java.lang.String = {java.lang.String@12380}true

And I get this exception:

jvm 1 | Caused by: org.neo4j.graphdb.TransactionFailureException: Could not create data source [nioneodb], see nested exception for cause of error
jvm 1 |     at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:153)
jvm 1 |     at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:111)
jvm 1 |     at org.neo4j.kernel.EmbeddedGraphDbImpl.init(EmbeddedGraphDbImpl.java:189)
jvm 1 |     at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:79)
jvm 1 |     at com.g1.dcg.graph.neo4j.NeoGraph.init(NeoGraph.java:118)
jvm 1 |     ... 12 more
jvm 1 | Caused by: org.neo4j.kernel.impl.nioneo.store.IllegalStoreVersionException: Store version [NeoStore v0.9.5]. Please make sure you are not running old Neo4j kernel on a store that has been created by newer version of Neo4j.
jvm 1 |     at org.neo4j.kernel.impl.nioneo.store.NeoStore.versionFound(NeoStore.java:431)
jvm 1 |     at org.neo4j.kernel.impl.nioneo.store.AbstractStore.loadStorage(AbstractStore.java:147)
jvm 1 |     at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.init(CommonAbstractStore.java:170)
jvm 1 |     at org.neo4j.kernel.impl.nioneo.store.AbstractStore.init(AbstractStore.java:120)
jvm 1 |     at org.neo4j.kernel.impl.nioneo.store.NeoStore.init(NeoStore.java:65)
jvm 1 |     at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.init(NeoStoreXaDataSource.java:132)
jvm 1 |     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
jvm 1 |     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
jvm 1 |     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
jvm 1 |     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
jvm 1 |     at org.neo4j.kernel.impl.transaction.XaDataSourceManager.create(XaDataSourceManager.java:75)
jvm 1 |     at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:147)
jvm 1 |     ... 16 more

My main question is whether this is supported or I am doing something wrong. I don't really need to support the upgrade of version 1.1 databases, but I want to make sure my code is correct so that I will be able to support upgrades in the future.

Thanks.

Paul Jackson
Re: [Neo4j] Beer and Talk
Washington, D.C.

Paul Jackson

-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Andreas Kollegger
Sent: Monday, March 14, 2011 8:37 AM
To: Neo4j user discussions
Subject: Re: [Neo4j] Beer and Talk

Anyone on the east coast of the States? Washington, Philly, NYC, maybe even Boston?

On Mar 14, 2011, at 1:32 PM, Emil Eifrem wrote:

> On Mon, Mar 14, 2011 at 12:15, Axel Morgner a...@morgner.de wrote:
>> Hi everybody,
>>
>> as said, here's a new thread for the idea of having beer and talk meetings.
>>
>> Possible locations so far:
>>
>> Malmö
>> London
>> Berlin
>> Frankfurt
>
> Let's add San Francisco bay area to that as well!
>
> Great initiative!
>
> Cheers,
>
> --
> Emil Eifrém, CEO [e...@neotechnology.com]
> Neo Technology, www.neotechnology.com
> Cell: +46 733 462 271 | US: 206 403 8808
> http://blogs.neotechnology.com/emil
> http://twitter.com/emileifrem
Re: [Neo4j] Big index solutions?
Hi Peter,

I finished my testing. I tried jdbm tree and map, HSQL, and jboss cache as a wrapper around both HSQL and jdbm.

I found that jboss cache doesn't necessarily persist to disk at the end of a transaction, so it fails the ACID test. HSQL is super fast in memory but was terrible when forced to commit every transaction. (I tested 1.8, which doesn't support transactions; each update is its own transaction. Maybe 2.0 is better.)

So that leaves jdbm. The tree (surprisingly) was much faster than the map. I know from experience that jdbm doesn't scale well with multiple threads, yet in this application I was thinking it may still be a good fit. It would be nice, though, if they at least used a ReentrantReadWriteLock rather than method synchronization to allow concurrent reads.

Hope that helps.

Thanks,
-Paul

-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Peter Neubauer
Sent: Tuesday, December 21, 2010 11:12 AM
To: rick.bullo...@burningskysoftware.com
Cc: Neo4j user discussions
Subject: Re: [Neo4j] Big index solutions?

Mmh, we are looking at JDBM now, and it seems to be promising. Will inform you on the progress of that!

Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Tue, Dec 21, 2010 at 12:19 PM, rick.bullo...@burningskysoftware.com wrote:

That should fit in RAM just fine, except for the effect of the string block/page size probably.

What about a btree backed by neo relationships? Not fast enough?

- Reply message -
From: Peter Neubauer peter.neuba...@neotechnology.com
Date: Mon, Dec 20, 2010 3:54 pm
Subject: [Neo4j] Big index solutions?
To: Neo4j user discussions user@lists.neo4j.org

Hi folks,

I wonder if any of you has seen a fast exact index solution that works for the batchinserter (FAST) and over big indexes (like 100M strings of length 20 characters) that don't fit in RAM. Lucene is unable to cache such indexes and gets slow. Does anybody have experiences with other reverse lookup solutions like Berkeley DB, Ehcache or others? Would be great to combine them with the batchinserter to be able to fast insert big edge-lists with node-index-lookups into Neo4j ...

Cheers,

/peter neubauer
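The remark above about ReentrantReadWriteLock versus method synchronization can be sketched as follows. This is not jdbm code; RwLockedIndex is a hypothetical wrapper, and a java.util.TreeMap stands in for the on-disk tree. The point is only that readers share the read lock while a writer needs exclusive access.

```java
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class RwLockedIndex {
    private final TreeMap<String, Long> map = new TreeMap<String, Long>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    Long get(String key) {
        lock.readLock().lock(); // many readers may hold this at once
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    void put(String key, long value) {
        lock.writeLock().lock(); // exclusive: waits for readers and other writers
        try {
            map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // While the calling thread holds the read lock, a second thread tries to take
    // the read lock and then the write lock. Returns {readerGotIn, writerGotIn}.
    static boolean[] probeWhileReading() {
        RwLockedIndex index = new RwLockedIndex();
        boolean[] result = new boolean[2];
        index.lock.readLock().lock();
        try {
            Thread other = new Thread(() -> {
                result[0] = index.lock.readLock().tryLock();
                if (result[0]) {
                    index.lock.readLock().unlock();
                }
                result[1] = index.lock.writeLock().tryLock();
                if (result[1]) {
                    index.lock.writeLock().unlock();
                }
            });
            other.start();
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            index.lock.readLock().unlock();
        }
        return result;
    }
}
```

With synchronized methods the second reader would have to wait as well; here only the writer is turned away while a read is in progress.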
Re: [Neo4j] Getting started with Neo4J Spatial
I've been doing some research into alternate storage mechanisms for exact match indexes. I excluded BDB from my list because it has a commercial license. I'll share my findings once I have something concrete.

Thanks.

Paul Jackson

-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Peter Neubauer
Sent: Friday, February 18, 2011 4:02 PM
To: Neo4j user discussions
Subject: Re: [Neo4j] Getting started with Neo4J Spatial

Guys,

I have started looking into the OSM imports. Basically, the main problem is to fast insert the nodes into an exact-matching index, and then, when going through the ways, look the nodes up by OSM-ID in that index in order to connect them to the ways. Lucene is not very good at that, so I have been tinkering with a BerkeleyDB based Neo4j index, see https://github.com/peterneubauer/bdb-index.

Then, I removed the batchinserter from the OSM import and instead use a faster index, see the branch at https://github.com/neo4j/neo4j-spatial/tree/embedded-import using BerkeleyDB and not Lucene, and actually passing the basic test suite. This is early work and not at all stable, I am right now trying to profile the import, but this would be something that could improve performance to predictable times. I am testing with Croatia.osm to start with (1.3M nodes, 230K ways) and want to get in Germany.osm (60M nodes, 8M ways).

I will be away for a week, but I hope that maybe others can look into profiling the OSM import during this time. Just saying - this is something we are aware of and would highly appreciate help with, so feel free to fork and improve!

Cheers,

/peter neubauer

On Fri, Feb 18, 2011 at 9:54 PM, Nolan Darilek no...@thewordnerd.info wrote:

On 02/18/2011 02:49 PM, bryce hendrix wrote:
> Nolan,
>
> The first experience with Neo4j Spatial was with the texas.osm file. I
> imported it on my notebook; I think it took 15 hours, if I remember
> correctly. I quickly decided to play around with just Austin for the time
> being.

Oh wow. I'm hoping this is being improved as well? I'm through a bit over 1.6 million ways and all nodes so I'm hoping it won't take 15 hours.

> If you'd like, I can zip up my Austin file (just Austin, not any of
> the suburbs) and send it to you. Austin takes about 30 secs to import and
> index.

That'd be great, especially as I'm in Austin and can actually use the data live. Thanks.
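The two-pass import Peter describes above (first index every node by OSM-ID, then resolve the node references of each way through that index) can be sketched with an in-memory map. This is not the neo4j-spatial importer; TwoPassImport and its methods are hypothetical, the graph node IDs are simply assigned sequentially, and a HashMap stands in for the exact-match index that in practice would have to live on disk at these data sizes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoPassImport {
    // Pass 1: assign each OSM node a graph node ID and record it in the exact-match index.
    static Map<Long, Long> indexNodes(long[] osmNodeIds) {
        Map<Long, Long> index = new HashMap<Long, Long>();
        long nextGraphId = 0;
        for (long osmId : osmNodeIds) {
            index.put(osmId, nextGraphId++);
        }
        return index;
    }

    // Pass 2: look up each node reference of a way and connect consecutive nodes.
    // Each returned pair is {fromGraphId, toGraphId}.
    static List<long[]> connectWay(Map<Long, Long> index, long[] wayNodeRefs) {
        List<long[]> edges = new ArrayList<long[]>();
        Long previous = null;
        for (long ref : wayNodeRefs) {
            Long current = index.get(ref);
            if (current == null) {
                continue; // unknown reference: skipped in this sketch
            }
            if (previous != null) {
                edges.add(new long[] { previous, current });
            }
            previous = current;
        }
        return edges;
    }
}
```

Every way costs one index lookup per referenced node, so lookup speed in the exact-match index dominates the second pass, which is why the choice of index matters so much for large extracts.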
Re: [Neo4j] Getting started with Neo4J Spatial
Yes, in my tests I included JDBM HMap, JDBM BTree, and JBoss Cache backed by JDBM (presumably BTree). I got the best performance with JBoss Cache, followed closely by BTree, with HMap a distant third. I would like to test HSQL and Derby before finishing. Thanks, -Paul

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Peter Neubauer Sent: Tuesday, February 22, 2011 10:51 AM To: Neo4j user discussions Subject: Re: [Neo4j] Getting started with Neo4J Spatial

Paul, I'm considering JDBM also; it should be very similar to the BDB index approach. Is that something you would like to see?

On Tuesday, February 22, 2011, Paul A. Jackson paul.jack...@pb.com wrote:

I've been doing some research into alternate storage mechanisms for exact-match indexes. I excluded BDB from my list because it has a commercial license. I'll share my findings once I have something concrete. Thanks. Paul Jackson

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Peter Neubauer Sent: Friday, February 18, 2011 4:02 PM To: Neo4j user discussions Subject: Re: [Neo4j] Getting started with Neo4J Spatial

Guys, I have started looking into the OSM imports. Basically, the main problem is to quickly insert the nodes into an exact-matching index and then, when going through the ways, look the nodes up by OSM ID in that index in order to connect them to the ways. Lucene is not very good at that, so I have been tinkering with a BerkeleyDB-based Neo4j index, see https://github.com/peterneubauer/bdb-index.

Then I removed the batch inserter from the OSM import and instead used a faster index; see the branch at https://github.com/neo4j/neo4j-spatial/tree/embedded-import, which uses BerkeleyDB instead of Lucene and actually passes the basic test suite. This is early work and not at all stable. I am trying to profile the import right now, but this is something that could bring import performance to predictable times. I am testing with Croatia.osm to start with (1.3M nodes, 230K ways) and want to get to Germany.osm (60M nodes, 8M ways). I will be away for a week, but I hope that others can look into profiling the OSM import during this time. Just saying: this is something we are aware of and would highly appreciate help with, so feel free to fork and improve! Cheers, /peter neubauer
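The two-pass import Peter describes (index every node by OSM ID first, then resolve the node references of each way) can be sketched roughly as follows. This is only an illustration: the class `ExactIndexSketch` and its methods are hypothetical, and an in-memory `HashMap` stands in for the exact-match index, whereas the thread is about disk-backed stores (BerkeleyDB, JDBM) for data sets that do not fit in RAM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExactIndexSketch {
    // Stand-in for the exact-match index: OSM node id -> internal graph node id.
    // (Assumption: a real import would use a disk-backed store instead.)
    private final Map<Long, Long> osmToGraphId = new HashMap<>();
    private long nextGraphId = 0;

    // Pass 1: create a graph node for an OSM node and index it by its OSM id.
    public long addNode(long osmId) {
        long graphId = nextGraphId++;
        osmToGraphId.put(osmId, graphId);
        return graphId;
    }

    // Pass 2: resolve the OSM node ids referenced by a way to graph node ids.
    public List<Long> resolveWay(long[] osmNodeRefs) {
        List<Long> resolved = new ArrayList<>();
        for (long ref : osmNodeRefs) {
            Long graphId = osmToGraphId.get(ref);
            if (graphId == null) {
                throw new IllegalArgumentException("way references unknown OSM node " + ref);
            }
            resolved.add(graphId);
        }
        return resolved;
    }

    public static void main(String[] args) {
        ExactIndexSketch index = new ExactIndexSketch();
        index.addNode(1001L);
        index.addNode(1002L);
        index.addNode(1003L);
        System.out.println(index.resolveWay(new long[] {1003L, 1001L}));
    }
}
```

Every way triggers one lookup per referenced node, so the cost of a single exact-match `get` dominates the second pass; that is why the choice of index matters so much here.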
Re: [Neo4j] Better support for large property data
Does it go without saying that, once this is implemented, a Neo4j instance would still be able to open a graph from a prior version? Would this be an automatic one-time conversion, would there be a utility that converts from one format to the other, or something else? Thanks, Paul Jackson

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Tobias Ivarsson Sent: Friday, February 18, 2011 9:20 AM To: Neo user discussions Subject: [Neo4j] Better support for large property data

Having tackled short strings, I feel up for taking a stab at long strings and large binary data objects. I know that Rick Bullotta is really interested in this, and I can imagine others wanting to store large properties as well. I would love to get your input on the ideas I have, as well as hearing about the ideas you might have.

The way I see it, there are two different kinds of large data objects.

The first one is long strings, or text. Imagine building a blog engine on Neo4j: the text body of a blog post is likely going to be around a thousand characters. That is a lot of blocks in the DynamicStringStore. But you still want to support shorter strings (the title of the post, for example) without much overhead, so you don't want to increase the block size for the DynamicStringStore. In your code you want to deal with these values as String objects, though; you don't want a different object type just because the string happens to be longer.

The second one is large binary data objects: data objects that are too large to want to have allocated as a String object, or even as a byte[] object. You want to manipulate them through some sort of streaming interface.
These data objects are also so large that you would prefer their content not to be written to the transaction logs, because that would mean that Neo4j needs to rotate the log extremely frequently, and since you keep the logical logs for HA and backup, it would fill up your disks twice as quickly as necessary. Properties like this would, for example, be used for storing images that are included in the blog posts.

For long Strings (the first point), the solution I'm thinking of is to replace the stringstore and arraystore with a smallstore and a largestore, both being dynamic block stores as they are today, but with different block sizes. Then store both arrays and strings in both of these stores. The type of the data stored in the blocks is recorded in the property record that references the blocks anyhow, so there isn't a great advantage in having different block stores for strings and arrays.

For BLOBs (the second point), we need additions to the API, since you want to work with these things in a streaming fashion. I am thinking that we use java.nio.channels.ReadableByteChannel for these properties. Why ReadableByteChannel, you ask? Why not InputStream?

First reason: InputStream can be converted to ReadableByteChannel, and vice versa: http://download.oracle.com/javase/6/docs/api/index.html?java/nio/channels/Channels.html

Second reason: ReadableByteChannel is a really simple interface (only three methods) if you want to write your own custom implementation.

Setting a BLOB property would then look like this:

    ReadableByteChannel myBlob = ...
    node.setProperty("a_blob", myBlob);

Getting would look like this:

    ReadableByteChannel myBlob = (ReadableByteChannel) node.getProperty("a_blob");

Perhaps we could then also come up with some nice API for appending to a BLOB property:

    ReadableByteChannel moreData = ...
    ReadableByteChannel myBlob = (ReadableByteChannel) node.getProperty("a_blob");
    node.setProperty("a_blob", BlobUtils.append(myBlob, moreData));

Comments, please.
-- Tobias Ivarsson tobias.ivars...@neotechnology.com Hacker, Neo Technology www.neotechnology.com
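The first reason Tobias gives, that an InputStream and a ReadableByteChannel can be converted into each other, relies on the JDK's `java.nio.channels.Channels` helper. The sketch below uses only JDK classes. The class name `BlobChannelSketch` and the tiny 8-byte buffer are illustrative assumptions, and nothing here is part of the proposed Neo4j property API, which did not exist at the time of the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

public class BlobChannelSketch {
    // Drain a channel chunk by chunk, the way a streaming consumer would.
    // The deliberately small buffer forces several read() calls.
    static byte[] drain(ReadableByteChannel channel) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteBuffer buffer = ByteBuffer.allocate(8);
        while (channel.read(buffer) != -1) {
            buffer.flip();
            out.write(buffer.array(), 0, buffer.limit());
            buffer.clear();
        }
        return out.toByteArray();
    }

    static String roundTrip(String text) throws IOException {
        InputStream source = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        // InputStream -> ReadableByteChannel
        ReadableByteChannel channel = Channels.newChannel(source);
        // ReadableByteChannel -> InputStream, then back to a channel again
        InputStream view = Channels.newInputStream(channel);
        try (ReadableByteChannel again = Channels.newChannel(view)) {
            return new String(drain(again), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("a blob that is longer than one buffer"));
    }
}
```

The three methods a custom implementation has to provide are `read(ByteBuffer)`, `isOpen()` and `close()`.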
[Neo4j] Help with exception
I've been doing some performance and scalability testing with large graphs (2,000,000 nodes, 5,000,000 edges; actually, the WikiTalk data from the Stanford SNAP site). I must have shut down my server improperly, because a number of graphs needed to recover when I started it back up, but this largest of the graphs didn't appear to recover properly. I now get the following exception when I attempt to load data from the graph. I am still using Neo4j 1.1. Can anyone say what the exception means? Corrupt database, throw it away and rebuild it?

Exception in thread "com.g1.dcg.graph.neo4j.NeoEigenvectorJob:2" java.lang.RuntimeException: org.neo4j.kernel.impl.nioneo.store.UnderlyingStorageException: Unable to load position[146166] @[1315494]
    at com.g1.dcg.graph.job.AbstractGraphJob.run(AbstractGraphJob.java:59)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.neo4j.kernel.impl.nioneo.store.UnderlyingStorageException: Unable to load position[146166] @[1315494]
    at org.neo4j.kernel.impl.nioneo.store.PersistenceRow.readPosition(PersistenceRow.java:101)
    at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:152)
    at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:474)
    at org.neo4j.kernel.impl.nioneo.store.NodeStore.loadLightNode(NodeStore.java:131)
    at org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.nodeLoadLight(ReadTransaction.java:74)
    at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$ReadOnlyResourceConnection.nodeLoadLight(NioNeoDbPersistenceSource.java:235)
    at org.neo4j.kernel.impl.persistence.PersistenceManager.loadLightNode(PersistenceManager.java:74)
    at org.neo4j.kernel.impl.core.NodeManager.getNodeById(NodeManager.java:391)
    at org.neo4j.kernel.EmbeddedGraphDbImpl.getNodeById(EmbeddedGraphDbImpl.java:223)
    at org.neo4j.kernel.EmbeddedGraphDbImpl$AllNodesIterator.hasNext(EmbeddedGraphDbImpl.java:426)
    at com.g1.dcg.graph.neo4j.NeoEigenvectorJob.runJob(NeoEigenvectorJob.java:72)
    at com.g1.dcg.graph.job.AbstractGraphJob.run(AbstractGraphJob.java:51)
    ... 1 more
Caused by: java.nio.channels.ClosedChannelException
    at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
    at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:299)
    at org.neo4j.kernel.impl.nioneo.store.PersistenceRow.readPosition(PersistenceRow.java:80)
    ... 12 more

Thanks, -Paul
Re: [Neo4j] Help with exception
This happens with a freshly started database. I don't see a file with the name that you mention. All I have are the following files with the word "log" in them:

graph.WikiTalk\active_tx_log
graph.WikiTalk\nioneo_logical.log.1
graph.WikiTalk\nioneo_logical.log.active
graph.WikiTalk\tm_tx_log.1
graph.WikiTalk\tm_tx_log.2
graph.WikiTalk\lucene\lucene.log.1
graph.WikiTalk\lucene\lucene.log.active
graph.WikiTalk\lucene-fulltext\lucene.log.1
graph.WikiTalk\lucene-fulltext\lucene.log.active

Thanks, -Paul

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Tobias Ivarsson Sent: Tuesday, February 01, 2011 10:58 AM To: Neo4j user discussions Subject: Re: [Neo4j] Help with exception

Are threads being interrupted anywhere in your application? There is an issue in java.nio: it closes open channels if a thread is interrupted during an operation. This means that I/O channels that are shared between multiple threads get closed if one of the threads using them is interrupted. Or does this happen with a freshly started application? If you could send over the messages.log file, that might shed some light on what has happened to your store. Cheers, -tobias
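The java.nio behaviour Tobias mentions can be reproduced with plain JDK classes. The sketch below is not Neo4j code: the class name `InterruptClosesChannel` and the temporary file are illustrative assumptions. It sets the interrupt flag before a read, which the JDK treats the same way as an interrupt arriving during the read: the channel is closed for everyone, and a later `size()` call fails with the same `ClosedChannelException` that appears at the bottom of the stack trace in this thread.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptClosesChannel {
    // Returns a short description of what happened to the channel.
    static String demo() throws IOException {
        Path file = Files.createTempFile("store", ".db");
        Files.write(file, new byte[] {1, 2, 3, 4});
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Simulate "a thread was interrupted while doing IO":
            // set the interrupt flag, then read from the channel.
            Thread.currentThread().interrupt();
            try {
                channel.read(ByteBuffer.allocate(4));
                return "read succeeded";
            } catch (ClosedByInterruptException e) {
                // Expected: the JDK has closed the channel as a side effect.
            } finally {
                Thread.interrupted(); // clear the flag so later calls are unaffected
            }
            try {
                channel.size();
                return "channel still open";
            } catch (ClosedChannelException e) {
                return "channel closed by interrupt";
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

Because the store files are shared, one interrupted worker is enough to make reads fail on every other thread that uses the same channel.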
Re: [Neo4j] Help with exception
I cannot think of a scenario where #1 could happen, but it seems very likely that #2 could happen. I'll add some safeguards that prevent that from happening in the future. Thanks a lot for the help. -Paul

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Tobias Ivarsson Sent: Tuesday, February 01, 2011 11:56 AM To: Neo4j user discussions Subject: Re: [Neo4j] Help with exception

Oh, sorry, I missed that you were still using 1.1; messages.log was introduced after 1.1 was released. However, the exception you supplied does not say that the store is corrupt, just that it cannot be read because it has been closed. There are only two scenarios I can think of where this can happen:

1. Concurrent invocation of graphdb.shutdown()
2. A thread was interrupted while doing IO operations on that same file.

Any chance your code could have already called graphdb.shutdown()? For example, a main() that starts a bunch of threads, then doesn't wait for them to finish but instead shuts down the graph database? Cheers, Tobias
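One possible safeguard against the first scenario (a main() that shuts the database down while worker threads are still reading) is to wait for the workers before closing anything. The following is a minimal sketch under stated assumptions: `FakeStore` is a hypothetical stand-in for the graph database, not a Neo4j class, and the pool size and timeout are arbitrary.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownAfterWorkers {
    // Hypothetical stand-in for the graph database; not a Neo4j class.
    static class FakeStore {
        private final AtomicBoolean open = new AtomicBoolean(true);
        private final AtomicInteger failedReads = new AtomicInteger();

        void read() {
            if (!open.get()) {
                failedReads.incrementAndGet(); // the "store already closed" case
            }
        }

        void shutdown() {
            open.set(false);
        }

        int failedReads() {
            return failedReads.get();
        }
    }

    static int runJobs(int jobs) throws InterruptedException {
        FakeStore store = new FakeStore();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < jobs; i++) {
            pool.submit(store::read);
        }
        // shutdown() lets queued jobs finish; shutdownNow() would interrupt
        // the workers, which is exactly the second scenario in the thread.
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        store.shutdown(); // only now is it safe to close the store
        return store.failedReads();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("failed reads: " + runJobs(100));
    }
}
```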
Re: [Neo4j] Big index solutions?
I do not have any direct experience, but I was wondering if anyone has experience with JBoss Cache over JDBM and could speculate on its applicability. Also, I would like to see this fast exact indexer available with GraphDatabaseService, not just BatchInserter; I am not able to use the BatchInserter because I need to be able to query the graph and update existing nodes during loads. Thanks, -Paul

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Peter Neubauer Sent: Monday, December 20, 2010 3:54 PM To: Neo4j user discussions Subject: [Neo4j] Big index solutions?

Hi folks, I wonder if any of you has seen a fast exact index solution that works for the batch inserter (FAST) and over big indexes (like 100M strings of 20 characters each) that don't fit in RAM. Lucene is unable to cache such indexes and gets slow. Does anybody have experience with other reverse lookup solutions like Berkeley DB, Ehcache or others? It would be great to combine them with the batch inserter to be able to quickly insert big edge lists with node index lookups into Neo4j. Cheers, /peter neubauer
Re: [Neo4j] Eigenvector Centrality subclasses
I will look into adding a "deterministic" property that defaults to false for backward compatibility, and test to see that the deterministic results are reasonable. I haven't built Neo4j before, so I can't commit to the success of this attempt. -Paul

-Original Message- From: neubauer.pe...@gmail.com [mailto:neubauer.pe...@gmail.com] On Behalf Of Peter Neubauer Sent: Wednesday, November 10, 2010 3:11 AM To: Neo4j user discussions Cc: Paul A. Jackson Subject: Re: [Neo4j] Eigenvector Centrality subclasses

Paul, Marko, could you test whether new Random(0) would be a good change? I am not really into that algorithm, so I think you could do a much better job there, given your expertise! Cheers, /peter neubauer

On Wed, Nov 10, 2010 at 12:19 AM, Paul A. Jackson paul.jack...@pb.com wrote:

Perhaps if new Random( System.currentTimeMillis() ) were replaced with new Random( 0 ), you would get the benefits of pseudo-random behavior but also deterministic results from run to run. -Paul
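The effect of the fixed seed suggested above can be seen in a small, self-contained power iteration. This is a sketch under stated assumptions, not the code in org.neo4j.graphalgo: the class `SeededPowerIteration`, the four-node example graph, the sum-to-1 normalisation and the stopping rule are all illustrative choices.

```java
import java.util.Arrays;
import java.util.Random;

public class SeededPowerIteration {
    // Small undirected graph: a triangle (0-1-2) plus a tail (2-3).
    static final int[][] ADJ = {
        {0, 1, 1, 0},
        {1, 0, 1, 0},
        {1, 1, 0, 1},
        {0, 0, 1, 0},
    };

    // Power iteration from a random start vector. Stops once the largest
    // per-component change between two iterations drops below the precision.
    static double[] centrality(Random random, double precision) {
        int n = ADJ.length;
        double[] v = new double[n];
        for (int i = 0; i < n; i++) {
            v[i] = random.nextDouble() + 0.01; // strictly positive start
        }
        normalise(v);
        for (int iteration = 0; iteration < 10_000; iteration++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    next[i] += ADJ[i][j] * v[j];
                }
            }
            normalise(next);
            double change = 0;
            for (int i = 0; i < n; i++) {
                change = Math.max(change, Math.abs(next[i] - v[i]));
            }
            v = next;
            if (change < precision) {
                break;
            }
        }
        return v;
    }

    // Scale the vector so that its components sum to 1.
    static void normalise(double[] v) {
        double sum = 0;
        for (double x : v) {
            sum += x;
        }
        for (int i = 0; i < v.length; i++) {
            v[i] /= sum;
        }
    }

    public static void main(String[] args) {
        // Same seed: identical results on every run and every machine.
        System.out.println(Arrays.toString(centrality(new Random(0), 0.001)));
        // Time-based seed: the start vector, and therefore the point at which
        // the iteration stops, differs from run to run.
        System.out.println(Arrays.toString(centrality(new Random(System.currentTimeMillis()), 0.001)));
    }
}
```

With a time-based seed the results still agree to roughly the requested precision, but not digit for digit, which is consistent with the run-to-run variation described in the thread.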
Re: [Neo4j] Eigenvector Centrality subclasses
Not sure if this fell through the cracks. Here are some more specific questions.

I get inconsistent results from run to run using eigenvector centrality. It doesn't seem to matter which implementation I use, but I have used Arnoldi most, for no reason other than that it returns the iteration count.

The iteration count is not consistent from run to run when run against the exact same graph using the exact same precision. In a graph with 32 nodes and 117 edges, I get anywhere from 18 to 24 iterations needed to get a precision of 0.001. The variance is easier to see when the test is run on different computers.

Also, I experience the same problem as Piyush below. Not sure if anything ever came from this:

On Wed, Jul 28, 2010 at 10:20 AM, Piyush Kanti Bhunre kbpiy...@gmail.com wrote:

Hi, I am getting some negative centrality values for nodes of a network using Neo4j's EigenvectorCentralityArnoldi. I am using this for small networks of a few thousand nodes. I am not sure if it is due to instability of the algorithm or to bugs in the implementation. Could you please comment on that? Thanks. Piyush

Thanks, -Paul
Re: [Neo4j] Eigenvector Centrality subclasses
I'm using:

    import org.neo4j.graphalgo.impl.centrality.EigenvectorCentrality;
    import org.neo4j.graphalgo.impl.centrality.EigenvectorCentralityArnoldi;
    import org.neo4j.graphalgo.impl.centrality.EigenvectorCentralityPower;

The variance I am seeing is far greater than anything that could be explained by floating point precision issues. For example, a result coming back after one call as 0.045 could come back as 0.038 on the next call with identical options.

I glanced over the code and I see that both implementations use java.util.Random, so that could explain why they are not deterministic. Maybe that answers everything. Unfortunately, what it means is that you might randomly have two subsequent calls that appear to return similar results, when actually you have not zeroed in on the correct answer within the level of precision that is desired. The JavaDoc explicitly states that precision doesn't mean proximity to the correct result, but that doesn't make the results any less unsatisfying. -Paul

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Marko Rodriguez Sent: Tuesday, November 09, 2010 6:06 PM To: Neo4j user discussions Subject: Re: [Neo4j] Eigenvector Centrality subclasses

Hey Paul,

> I get inconsistent results from run to run using eigenvector centrality. It doesn't seem to matter which implementation I use but I have used Arnoldi most, for no reason other than it returns the iteration count.

Given that eigenvector components sum to 1, and when dealing with large graphs, you may be running into floating point precision issues. In general, different eigenvector methods may have small variations in their values (even though it's the same eigenvector!), but if you are getting a Spearman rank order correlation of ~1.0, then I think it's 'all good.' Also, note that for those eigenvector centrality implementations that are based on random walks, variations are sure to show up.

> The iteration count is not consistent from run to run when run against the exact same graph using the exact same precision. In a graph with 32 nodes and 117 edges, I get anywhere from 18 to 24 iterations needed to get a precision of 0.001. The variance is easier to see when the test is run on different computers.

Hmm... What code are you using? I'm talking in general and not specifically about anything Neo4j related...

Thanks, Marko. http://markorodriguez.com
[Neo4j] Eigenvector Centrality subclasses
Anyone know the pros/cons of the Arnoldi eigenvector centrality implementation over the Power implementation? I see that Arnoldi gives a little more information on number of iterations, but it seems neither is deterministic. Thanks, Paul Jackson Pitney Bowes
Re: [Neo4j] Exception when adding a Comment property to a node
All, sorry for the false alarm. It turns out that the actual field name that I was using was "Comment\n" (literal newline at end of string), not "Comment". We do not have a requirement to index properties with newlines in their name, so you can ignore this post. Thanks for looking into it. -Paul

If anyone still wants the test program, here it is:

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.index.lucene.LuceneIndexService;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    public class Comment {
        private static GraphDatabaseService graph;
        private static LuceneIndexService indexService;

        public static void main(String[] args) {
            graph = new EmbeddedGraphDatabase("./graph.comment");
            indexService = new LuceneIndexService(graph);
            Transaction tx = null;
            try {
                Node node;
                // An ordinary key indexes and commits without problems.
                tx = graph.beginTx();
                node = graph.createNode();
                setProperty(node, "Foo", "Bar");
                tx.success();
                tx.finish();
                // A key ending in a newline triggers the exception on commit.
                tx = graph.beginTx();
                node = graph.createNode();
                setProperty(node, "Comment\n", "This should be no different.");
                tx.success();
                tx.finish();
            } catch (Throwable e) {
                e.printStackTrace();
                // tx is still null if the first beginTx() itself failed.
                if (tx != null) {
                    tx.failure();
                    tx.finish();
                }
            }
            indexService.shutdown();
            graph.shutdown();
        }

        // Sets the property on the node and indexes the node under the same key.
        private static void setProperty(Node node, String key, String value) {
            node.setProperty(key, value);
            indexService.index(node, key, value);
        }
    }

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Mattias Persson Sent: Tuesday, October 12, 2010 3:46 AM To: Neo4j user discussions Subject: Re: [Neo4j] Exception when adding a Comment property to a node

Seems like your index key contains characters that are illegal on your current file system... Would it be possible to see the code? Or could you tell us which index keys you use to index your comments?

2010/10/12 Tobias Ivarsson tobias.ivars...@neotechnology.com

Hi Paul, Comment is not a reserved word; there are no reserved words for what property keys you can use.
Are you also indexing that Comment property? It looks like the exception originates from the index component. If so we need to find a work around for that. I can see that you are running this on windows. I'm wondering if this could be the index component trying to create some file (for storing the index) that your file system rejects. Which versions of Neo4j-kernel and Neo4j-index are you using? Have you isolated this so that it is only a Comment property being indexed, or are there other indexes to factor in as well? Cheers, Tobias On Tue, Oct 12, 2010 at 12:49 AM, Paul A. Jackson paul.jack...@pb.com wrote: Using Neo4j 1.1, when I create a node that has a Comment property, I get the following exception when I commit the node: javax.transaction.xa.XAException: Unknown xid[GlobalId[NEOKERNL|1286836445937|4], BranchId[ 52 49 52 49 52 49 ]] at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.rollback(XaResourceManager.java:416) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.rollback(XaResourceHelpImpl.java:111) at org.neo4j.kernel.impl.transaction.TransactionImpl.doRollback(TransactionImpl.java:533) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:616) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:561) at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:104) at org.neo4j.kernel.EmbeddedGraphDbImpl$TransactionImpl.finish(EmbeddedGraphDbImpl.java:560) at com.g1.dcg.graph.neo4j.NeoGraph.newNeoNode(NeoGraph.java:167) Oct 11, 2010 6:34:06 PM org.neo4j.kernel.impl.transaction.TxManager commit SEVERE: Unable to rollback transaction. Some resources may be commited others not. 
Neo4j kernel should be SHUTDOWN for resource maintance and transaction recovery java.lang.RuntimeException: java.io.IOException: The filename, directory name, or volume label syntax is incorrect at org.neo4j.index.lucene.LuceneDataSource.getIndexWriter(LuceneDataSource.java:398) at org.neo4j.index.lucene.LuceneTransaction.doCommit(LuceneTransaction.java:207) at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:316) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:399) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64) at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:516) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:590
[Neo4j] Exception when adding a Comment property to a node
Using Neo4j 1.1, when I create a node that has a Comment property, I get the following exception when I commit the node: javax.transaction.xa.XAException: Unknown xid[GlobalId[NEOKERNL|1286836445937|4], BranchId[ 52 49 52 49 52 49 ]] at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.rollback(XaResourceManager.java:416) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.rollback(XaResourceHelpImpl.java:111) at org.neo4j.kernel.impl.transaction.TransactionImpl.doRollback(TransactionImpl.java:533) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:616) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:561) at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:104) at org.neo4j.kernel.EmbeddedGraphDbImpl$TransactionImpl.finish(EmbeddedGraphDbImpl.java:560) at com.g1.dcg.graph.neo4j.NeoGraph.newNeoNode(NeoGraph.java:167) Oct 11, 2010 6:34:06 PM org.neo4j.kernel.impl.transaction.TxManager commit SEVERE: Unable to rollback transaction. Some resources may be commited others not. 
Neo4j kernel should be SHUTDOWN for resource maintance and transaction recovery java.lang.RuntimeException: java.io.IOException: The filename, directory name, or volume label syntax is incorrect at org.neo4j.index.lucene.LuceneDataSource.getIndexWriter(LuceneDataSource.java:398) at org.neo4j.index.lucene.LuceneTransaction.doCommit(LuceneTransaction.java:207) at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:316) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:399) at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64) at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:516) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:590) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:561) at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:104) at org.neo4j.kernel.EmbeddedGraphDbImpl$TransactionImpl.finish(EmbeddedGraphDbImpl.java:560) at com.g1.dcg.graph.neo4j.NeoGraph.newNeoNode(NeoGraph.java:167) Caused by: java.io.IOException: The filename, directory name, or volume label syntax is incorrect at java.io.WinNTFileSystem.canonicalize0(Native Method) at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:396) at java.io.File.getCanonicalPath(File.java:559) at org.apache.lucene.store.FSDirectory.getCanonicalPath(FSDirectory.java:340) at org.apache.lucene.store.FSDirectory.init(FSDirectory.java:381) at org.apache.lucene.store.SimpleFSDirectory.init(SimpleFSDirectory.java:40) at org.apache.lucene.store.FSDirectory.open(FSDirectory.java:424) at org.apache.lucene.store.FSDirectory.open(FSDirectory.java:411) at org.neo4j.index.lucene.LuceneDataSource.getDirectory(LuceneDataSource.java:326) at org.neo4j.index.lucene.LuceneDataSource.getIndexWriter(LuceneDataSource.java:385) ... 
19 more

Is Comment a reserved word, and are there other reserved words?

Thanks, -Paul
___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo4j] Obtaining the edge between two specific nodes
I'm looking for an efficient way to find the edge (or edges) between two nodes. I have a requirement that, before I add an edge between two specific nodes, I first determine whether the edge already exists. That leads to the need for a method that returns such an edge given the subject and object. Since I was unable to find such a method, I attempted to create my own, which worked but has become the performance bottleneck in the code.

private Relationship getSingeRelationship(Node subject, Node object, RelationshipType relationshipType, String key) {
    for (Relationship relationship : getCommonRelationships(subject, object, relationshipType, Direction.OUTGOING)) {
        boolean currentHasIndex = relationship.hasProperty(EDGE_INDEX_KEY);
        if (key == null) {
            // Looking for the un-keyed edge: accept the first one without the property.
            if (!currentHasIndex) {
                return relationship;
            }
        } else {
            // Looking for a keyed edge: the property must exist and match.
            if (currentHasIndex && key.equals(relationship.getProperty(EDGE_INDEX_KEY))) {
                return relationship;
            }
        }
    }
    return null;
}

...where getCommonRelationships returns an Iterable<Relationship> and simply loops through all of the subject's relationships (of the specified relationship type and direction) until it finds an edge that points to the given object node. The profiler says that about two-thirds of the time is spent in org.neo4j.kernel.impl.core.IntArrayIterator.hasNext() and most of the rest is spent in org.neo4j.kernel.impl.core.RelationshipProxy.getOtherNode(Node).

I am aware of the lab code that is supposed to support indexed edges and could see using that as a solution, but I was hoping for something native (and merged into the 1.1-SNAPSHOT branch). Are there any better approaches?

Thanks, -Paul
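One way to avoid the linear scan described above is to keep a lookup keyed by the two endpoints, the relationship type, and the edge key. The following is only a minimal in-memory sketch of that idea, assuming node and relationship ids are available as longs. EdgeLookup, EdgeKey, put, and find are hypothetical names, not part of the Neo4j API, and the map is neither persistent nor transactional, so it would have to be kept in sync with the graph by the caller.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not a Neo4j class: maps (subject, object, type, key) to a relationship id.
class EdgeLookup {

    /** Composite key; the edge key may be null for the un-keyed edge. */
    static final class EdgeKey {
        final long subjectId;
        final long objectId;
        final String type;
        final String key;

        EdgeKey(long subjectId, long objectId, String type, String key) {
            this.subjectId = subjectId;
            this.objectId = objectId;
            this.type = type;
            this.key = key;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof EdgeKey)) {
                return false;
            }
            EdgeKey other = (EdgeKey) o;
            return subjectId == other.subjectId
                    && objectId == other.objectId
                    && type.equals(other.type)
                    && Objects.equals(key, other.key);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subjectId, objectId, type, key);
        }
    }

    private final Map<EdgeKey, Long> edgeIds = new HashMap<>();

    /** Records the relationship id for the given endpoints, type, and (possibly null) key. */
    void put(long subjectId, long objectId, String type, String key, long relationshipId) {
        edgeIds.put(new EdgeKey(subjectId, objectId, type, key), relationshipId);
    }

    /** Returns the recorded relationship id, or null if no such edge was recorded. */
    Long find(long subjectId, long objectId, String type, String key) {
        return edgeIds.get(new EdgeKey(subjectId, objectId, type, key));
    }
}
```

The lookup is directional, matching the OUTGOING check in the original method: an edge recorded from subject to object is not found when the two ids are swapped.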
[Neo4j] Remove Indexes during BatchInsertion
I have a program for loading data into a graph and would like to support the case where later records contain data for nodes that were defined in prior records. In some cases it is possible that a later record may indicate that a node's property should be null where earlier it was given a value. This causes me to wish I had the removeIndex methods that I have in the non-batch-inserting version of LuceneIndex. Am I out of luck? I recall an earlier discussion where consideration was being given to a version of the batch inserter that implemented the GraphDatabaseService and LuceneIndexService interfaces. Did anything come of that? Thanks, -Paul ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo4j] Automating transactions
All,

I am interested in encapsulating the business of managing transactions inside a generic graph API. I assume I will have some maximum count; after that many write operations, the API will finish the transaction and start a new one. I have a few questions around this.

1) Can I ignore reads? If I write a few nodes within a transaction, can I then read indefinitely, or will the fact that I have an open transaction cause Neo4j to consume more memory until the transaction is finished?

2) Is there any guideline for the relative amounts of memory that various operations take (writing a node, writing an edge, writing a property, and so on)? Should I bump my counter once for each of these?

3) Since the API will operate in a multi-user environment, is a per-user count a bad idea? Should I maintain a user count and a global count and adjust the user limit based upon the number of concurrent users? Or should I monitor available free memory instead of, or in addition to, maintaining this counter?

Any other suggestions? Thanks in advance!

-Paul Jackson
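The counting scheme described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a recommendation from the list: Tx and TxSource are hypothetical stand-ins that mirror only the beginTx(), success(), and finish() calls seen elsewhere in these threads, every write is weighted equally, and the class is single-threaded, so it does not answer the per-user or memory-based questions.

```java
// Hypothetical sketch: commit after every `limit` writes. Not thread-safe.
class BatchingTransactions {

    /** Stand-in for a transaction handle (assumption; mirrors success()/finish()). */
    interface Tx {
        void success();
        void finish();
    }

    /** Stand-in for whatever hands out transactions (assumption; mirrors beginTx()). */
    interface TxSource {
        Tx beginTx();
    }

    static final class WriteBatcher {
        private final TxSource source;
        private final int limit;
        private Tx current;
        private int pendingWrites;
        private int commits;

        WriteBatcher(TxSource source, int limit) {
            this.source = source;
            this.limit = limit;
        }

        /** Call once per write operation; opens a transaction lazily and commits at the limit. */
        void recordWrite() {
            if (current == null) {
                current = source.beginTx();
            }
            pendingWrites++;
            if (pendingWrites >= limit) {
                commit();
            }
        }

        /** Commits any open transaction; a no-op when nothing is pending. */
        void commit() {
            if (current == null) {
                return;
            }
            current.success();
            current.finish();
            current = null;
            pendingWrites = 0;
            commits++;
        }

        int commits() {
            return commits;
        }
    }

    /** Demo helper: performs `writes` writes against a no-op transaction and returns the commit count. */
    static int commitsAfter(int writes, int limit) {
        TxSource noOpSource = new TxSource() {
            @Override
            public Tx beginTx() {
                return new Tx() {
                    @Override
                    public void success() { }

                    @Override
                    public void finish() { }
                };
            }
        };
        WriteBatcher batcher = new WriteBatcher(noOpSource, limit);
        for (int i = 0; i < writes; i++) {
            batcher.recordWrite();
        }
        batcher.commit(); // flush the final partial batch, if any
        return batcher.commits();
    }

    public static void main(String[] args) {
        System.out.println(commitsAfter(45, 20)); // two full batches plus one partial batch: 3
    }
}
```

Because reads never call recordWrite(), they do not advance the counter in this sketch. Whether an open transaction with pending writes grows in memory during long reads is exactly question 1 above and is not settled by this code.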
Re: [Neo4j] Best way to visualize?
There is also Jung. http://jung.sourceforge.net/ -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Jeff Klann Sent: Thursday, August 05, 2010 3:15 PM To: Neo4j user discussions Subject: [Neo4j] Best way to visualize? Hi all, there have been snippets of discussion on this but not a full answer. I'm looking for the best way to quickly pull up part of my graph to visualize. Optimally I'd like to visualize programmatically - e.g., write a little code that would throw up some subset of nodes and edges on the screen in a nicely laid-out way. The tools I've found: * Neoclipse: Seems super-slow and I can't look up a node either in an index or by ID, which makes it basically unusable for me. The search function, whatever it does, just runs and runs. * iGraph: Looks like a cool tool, but I can't figure out a way to interact with it in the JVM. Couldn't even figure out how to install it in Jython. * Gephi: Way too buggy. Looks like it will be cool eventually. The alpha version with the half-finished Neo4J import didn't get all my edges properly, and even after that most things worked erratically. The output looks awesome, though. * Cytoscape: Actually the best thing I've come across, though a little heavyweight for just throwing some stuff on the screen and I haven't learned the API well enough to get it to interact directly with my Neo4J code. It wasn't too hard to export part of my graph to XGMML via the Cytoscape API and then import it into the Cytoscape GUI, though it does seem an unnecessary extra step. So can anyone help with: - Neoclipse, can I look up a node by index value or at least by ID? - iGraph, how to integrate with the JVM? - Cytoscape, how can I tighten my integration (rather than through XGMML)? - Is there a tool I've missed? Thanks! 
Jeff Klann
Re: [Neo4j] TraversalDescription building hickup
I may have missed your point. But, FWIW, this model reflects what I would expect from an immutable object. For example:

String s = "Test";
s.replace('T', 't');
// s still contains "Test"

BigInteger and BigDecimal behave the same way.

-Paul

-----Original Message-----
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Mattias Persson
Sent: Tuesday, July 27, 2010 2:41 PM
To: Neo4j user discussions
Subject: Re: [Neo4j] TraversalDescription building hickup

2010/7/27 Peter Neubauer peter.neuba...@neotechnology.com

Hi all, I just stumbled over the immutable TraversalDescription API (http://components.neo4j.org/neo4j-kernel/apidocs/index.html), which will not modify the object if you do

TraversalDescription td = new TraversalDescriptionImpl();
td.depthFirst();

Instead, one needs to reassign td, like

TraversalDescription td = new TraversalDescriptionImpl();
td = td.depthFirst();

However,

TraversalDescription td = new TraversalDescriptionImpl().depthFirst();

will give you the expected td. IMHO this is unexpected behaviour and hard to get if you just follow the common fluent API and presume a Builder pattern, especially since no errors are thrown and you just end up with strange results and unreachable code in, e.g., a custom PruneEvaluator. True, the API says it is immutable, but I still think this is hard. WDYT? Should we think of changing this to a proper builder.modify().modify() etc. and finally builder.build(), which gives you the final, immutable instance of TraversalDescription and is clearly understandable by clients?

I still think the current approach is more useful (although it'd be nice with more input on this). One reason I think it's better is that you can half-bake descriptions as private static final or similar and then complete the descriptions in several different places in your code. You can even pass descriptions into methods and whatnot, without any risk of them being modified.
I think javadoc should better explain this and it should be expected that developers read javadoc, right? Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
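Both points in this thread, the discarded return value and the safely shared half-baked description, can be seen in a small self-contained sketch. Description below is a hypothetical stand-in written for illustration; it is not the Neo4j TraversalDescription class, and the "UNSPECIFIED" default and depthLimit accessor are assumptions of the sketch only.

```java
// Hypothetical immutable fluent type, for illustration only.
class ImmutableFluentDemo {

    static final class Description {
        private final String order;
        private final int maxDepth;

        Description() {
            this("UNSPECIFIED", -1);
        }

        private Description(String order, int maxDepth) {
            this.order = order;
            this.maxDepth = maxDepth;
        }

        /** Returns a new instance; the receiver is left unchanged. */
        Description depthFirst() {
            return new Description("DEPTH_FIRST", maxDepth);
        }

        /** Returns a new instance; the receiver is left unchanged. */
        Description maxDepth(int depth) {
            return new Description(order, depth);
        }

        String order() {
            return order;
        }

        int depthLimit() {
            return maxDepth;
        }
    }

    // A half-baked description that can be shared without risk of modification.
    static final Description BASE = new Description().depthFirst();

    public static void main(String[] args) {
        Description td = new Description();
        td.depthFirst();                       // result discarded, td is unchanged
        System.out.println(td.order());        // prints UNSPECIFIED

        td = td.depthFirst();                  // reassigning picks up the new instance
        System.out.println(td.order());        // prints DEPTH_FIRST

        Description shallow = BASE.maxDepth(2);
        System.out.println(shallow.depthLimit()); // prints 2
        System.out.println(BASE.depthLimit());    // prints -1, BASE is untouched
    }
}
```

A mutable builder with a final build() call would make the first case behave as a newcomer expects, at the cost of the sharing guarantee shown in the last two lines.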
Re: [Neo4j] GraphML Nested Graphs
I think I found my answer in http://graphml.graphdrawing.org/primer/graphml-primer.html#Nested The edges between two nodes in a nested graph have to be declared in a graph, which is an ancestor of both nodes in the hierarchy. Note that this is true for our example. Declaring the edge between node n6::n1 and node n4::n0::n0 inside graph n6::n0 would be wrong while declaring it in graph G would be correct. A good policy is to place the edges at the least common ancestor of the nodes in the hierarchy, or at the top level. -Paul From: Paul A. Jackson Sent: Monday, July 26, 2010 2:52 PM To: 'Neo4j user discussions' Subject: GraphML Nested Graphs I am looking into nested graphs and have not found an answer to a specific case. Generally, when a node from one level links to a node in a sub graph, the edge should be defined in the outer graph. In the case where two nodes are in two different (peer) subgraphs at the same level, should the edge go in the outer level that contains them both? See edge e4 below. ?xml version=1.0 encoding=UTF-8? graphml xmlns=http://graphml.graphdrawing.org/xmlns; xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance; xsi:schemaLocation=http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd; graph id=G edgedefault=undirected node id=n0/ graph id=n5: edgedefault=undirected node id=n5::n1/ node id=n5::n2/ edge id=e0 source=n5::n1 target=n5::n2/ /graph /node node id=n1 graph id=n6: edgedefault=undirected edge id=e1 source=n6::n1 target=n6::n2/ /graph /node edge id=e2 source=n5::n2 target=n0/ edge id=e3 source=n0 target=n2/ edge id=e4 source=n6::n1 target=n5::n2/ /graph /graphml Thanks, -Paul ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
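The least-common-ancestor policy quoted from the primer can be sketched in a few lines for a writer that has to decide where to emit each edge. This sketch assumes the id convention used in the primer's example, where a node id such as n4::n0::n0 spells out its enclosing nodes and the graph nested in node n5 has id "n5:". GraphML itself does not require ids to follow that convention, and NestedGraphEdges and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: picks the graph in which an edge should be declared.
class NestedGraphEdges {

    /** Enclosing nodes of an id, outermost first: "n4::n0::n0" gives [n4, n0]; "n0" gives []. */
    static List<String> enclosingPath(String nodeId) {
        String[] parts = nodeId.split("::");
        return new ArrayList<>(Arrays.asList(parts).subList(0, parts.length - 1));
    }

    /**
     * Returns the id of the least common ancestor graph of the two nodes,
     * or topLevelGraphId when they share no enclosing node.
     */
    static String declaringGraph(String sourceId, String targetId, String topLevelGraphId) {
        List<String> a = enclosingPath(sourceId);
        List<String> b = enclosingPath(targetId);
        int common = 0;
        while (common < a.size() && common < b.size() && a.get(common).equals(b.get(common))) {
            common++;
        }
        if (common == 0) {
            return topLevelGraphId;
        }
        // Assumed convention: the graph nested in node X has id "X:".
        return String.join("::", a.subList(0, common)) + ":";
    }

    public static void main(String[] args) {
        // Edge e4 from the question: peers in different subgraphs belong in the outer graph.
        System.out.println(declaringGraph("n6::n1", "n5::n2", "G")); // prints G
        // Edge e0: both ends inside the same subgraph.
        System.out.println(declaringGraph("n5::n1", "n5::n2", "G")); // prints n5:
    }
}
```

Always returning the top-level graph would also satisfy the primer's rule; the least common ancestor simply keeps each edge as close as possible to the nodes it connects.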
Re: [Neo4j] Problem-Solving with Graph Traversals (Presentation)
I would benefit from such a primer. -Paul -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Marko Rodriguez Sent: Monday, July 26, 2010 9:52 AM To: Neo4j user discussions Subject: Re: [Neo4j] Problem-Solving with Graph Traversals (Presentation) Hello, The slides would be clearer for me if each expression was paraphrased in English (e.g. i ∈ V - i is a member of the set V), and I think it'd help other graph theory newbies too. Computer programming has a direct mapping to mathematical notation. For example: i ∈ V is equivalent to: % set up V = new HashSet(); V.add(i); % equivalence assertTrue(V.contains(i)) What I could do is write a short article entitled, something along the lines of, The Symbols, Syntax, and Semantics of Mathematics and Computation. At which point, I can establish the mapping between mathematical notation and programming statements. Thoughts?, Marko. http://markorodriguez.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] GraphML Nested Graphs
At the moment I am working on a writeGraphML(ListString groupBy) where each groupBy is a property key that may be assigned to the nodes. Nodes with null values for these keys would go in the top-level graph. So in short, I am trying to support this without changing the graph schema. -Paul -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Marko Rodriguez Sent: Monday, July 26, 2010 3:46 PM To: Neo4j user discussions Subject: Re: [Neo4j] GraphML Nested Graphs Hey, Are you modeling a nested graph in Neo4j? If so, what is your pattern? Thanks, Marko. On Jul 26, 2010, at 12:52 PM, Paul A. Jackson wrote: I am looking into nested graphs and have not found an answer to a specific case. Generally, when a node from one level links to a node in a sub graph, the edge should be defined in the outer graph. In the case where two nodes are in two different (peer) subgraphs at the same level, should the edge go in the outer level that contains them both? See edge e4 below. ?xml version=1.0 encoding=UTF-8? graphml xmlns=http://graphml.graphdrawing.org/xmlns; xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance; xsi:schemaLocation=http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd; graph id=G edgedefault=undirected node id=n0/ graph id=n5: edgedefault=undirected node id=n5::n1/ node id=n5::n2/ edge id=e0 source=n5::n1 target=n5::n2/ /graph /node node id=n1 graph id=n6: edgedefault=undirected edge id=e1 source=n6::n1 target=n6::n2/ /graph /node edge id=e2 source=n5::n2 target=n0/ edge id=e3 source=n0 target=n2/ edge id=e4 source=n6::n1 target=n5::n2/ /graph /graphml Thanks, -Paul ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo4j] Subgraphs
All,

We are considering leveraging the concept of subgraphs as an approach for interactively visualizing large quantities of data. The idea is that by aggregating collections of nodes that share some characteristic into a single node, we may be able to reduce a graph's complexity to the point where an entire graph can be displayed on screen and convey useful information to a user. At that point, specific nodes could be expanded to show additional detail. In theory, these subgraphs could be nested to multiple levels.

GraphML already has support for this idea, but not in a way that reduces bandwidth or memory requirements. I think we would prefer to leave the detail of the subgraph out of the XML until a user requests it. Otherwise, we would still suffer from sluggish performance and memory constraints.

My questions for the group are:

1) Are there any suggestions for how subgraphs might be algorithmically determined? Are there clustering algorithms that might be leveraged?

2) Is there a recommended way of storing subgraphs in Neo4j? Should additional nodes for the subgraphs be inserted into the graph, with links from the contained nodes back to the subgraph nodes, or are there ways to do this that do not involve altering the graph schema, perhaps an attribute-based approach?

3) Any other ideas on this train of thought?

I do not have a specific use case in mind. I am hoping to identify an approach that could be applied to large numbers of nodes in general. By a large graph, I mean one million nodes or more, and by a manageable graph I mean fewer than one thousand nodes.

Thanks in advance, -Paul Jackson
Re: [Neo4j] graph-matching from web application
It seems a query optimizer would be of use in this case. If you are looking for A-Rel1-B-Rel2-C, it would be helpful to know what the frequency of A, B, C, (and possibly Rel1, and Rel2 if relationships are indexed in the future) and start your traversal with whichever set of nodes is least common. This is an interesting problem. If you could represent a molecule canonically in text (so that a given molecule would always be represented in text the exact same way), then it would seem a text search would be the way to go (eliminating the need for the graph altogether). -Paul -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Jonathan Marten Sent: Wednesday, July 21, 2010 4:45 AM To: user@lists.neo4j.org Subject: [Neo4j] graph-matching from web application Hi David, thanks a lot for your answers! They were very helpful. I'm not sure I understand your setup. Could you describe a, b and c in more detail? What do you mean by a subgraph in this case? What makes it a subgraph, i.e. what is the greater graph? My setup is similar to someone looking for a chemical molecule. My database would hold the structure of 200 million molecules and the user wants get information on every molecule that contains a certain structure. He can construct this structure using a html form and then we search in neo4j for all matching molecules (i.e. all that contain a CH2-CH=O or whatever people can think of). Neo4j returns the IDs of these molecules and then we use our existing Perl/PHP Scripts to retrieve more information from the relational database, visualize, and so on. Well, this depends on how you roll it. If you have a separate database, then you will have to access it via e.g. REST or using the remote graph db API. But you can also have it embedded in your application, running in the webapp. But you might not be using a Java webapp? Right, I'm not using a Java webapp. 
So my solution will probably be to implement a simple multi-threaded server in Java (for instance like the one at the end of this page: http://download.oracle.com/docs/cd/E17409_01/javase/tutorial/networking/sockets/clientServer.html) and then query that server from CGI-scripts on the webserver running the web application. I don't understand what you mean. Please clarify. For example, you can't attach properties to the graph. Only to nodes and relationships. That was exactly what I meant. I wanted to store the ID only once per molecule and not on every node. But to answer your questions: I think you always need to do matching starting from a node. You can match subgraphs with properties using the addPropertyConstraint method on PatternNode and PatternRelationship. You can match relationships too using the PatternRelationship class. Thanks a lot, I thought it would be like that from reading the documentation, but I wasn't sure. What I wanted to do is sort of abstract matching, i.e. I wanted to retrieve a structure like N--rel1--N--rel2--N anywhere in the database without knowing anything about the nodes and only knowing the relationship types. I have found a way of doing something like that by changing my database design. It will not be very efficient, but it should work. Thanks again for your help, I think I know what to do now. Best regards, Jonathan -- Neu: GMX De-Mail - Einfach wie E-Mail, sicher wie ein Brief! Jetzt De-Mail-Adresse reservieren: http://portal.gmx.net/de/go/demail ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] OutOfMemory while populating large graph
I confess I had not investigated the batch inserter. From the description it fits my requirements exactly. With respect to auto-commits, it seems there are two use cases. The first is every day operations that might run out of memory. In this case it might be nice for neo4j to swap out memory to temporary disk as needed. If this performs acceptably, I think that should be default behavior. The second case is the initial population of a graph, where there is no need for roll back and so there is no need to commit to a temporary location. In this case, it seems having neo4j decide when to commit would be ideal. My concern with the first use case is that swapping to temporary storage at ideal intervals may be less efficient than having the user commit to permanent storage at less-than-ideal intervals. If that is the case, then the only real justification for committing to temporary storage would be if there was a requirement to potentially roll back a transaction that was larger than memory could support. -Paul -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Mattias Persson Sent: Friday, July 09, 2010 7:30 AM To: Neo4j user discussions Subject: Re: [Neo4j] OutOfMemory while populating large graph 2010/7/9 Marko Rodriguez okramma...@gmail.com Hi, Would it actually be worth something to be able to begin a transaction which auto-committs stuff every X write operation, like a batch inserter mode which can be used in normal EmbeddedGraphDatabase? Kind of like: graphDb.beginTx( Mode.BATCH_INSERT ) ...so that you can start such a transaction and then just insert data without having to care about restarting it now and then? Thats cool! Does that already exist? In my code (like others on the list it seems) I have a counter++ that every 20,000 inserts (some made up number that is not going to throw an OutOfMemory) commits and the reopens a new transaction. Sorta sux. 
No it doesn't, I just wrote stuff which I though someone could think of as useful. A cool thing with just telling it to do a batch insert mode transaction (not the actual commit interval) is that it could look at how much memory it had to play around with and commit whenever it would be the most efficient, even having the ability to change the limit on the fly if the memory suddenly ran out. Thanks, Marko. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo4j] OutOfMemory while populating large graph
I have seen people discuss committing transactions after some microbatch of a few hundred records, but I thought this was optional. I thought Neo4j would automatically write out to disk as memory became full. Well, I encountered an OOM and want to make sure that I understand the reason. Was my understanding incorrect, is there a parameter that I need to set to some limit, or is the problem that I am indexing as I go?

The stack trace, FWIW, is:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.<init>(HashMap.java:209)
    at java.util.HashSet.<init>(HashSet.java:86)
    at org.neo4j.index.lucene.LuceneTransaction$TxCache.add(LuceneTransaction.java:334)
    at org.neo4j.index.lucene.LuceneTransaction.insert(LuceneTransaction.java:93)
    at org.neo4j.index.lucene.LuceneTransaction.index(LuceneTransaction.java:59)
    at org.neo4j.index.lucene.LuceneXaConnection.index(LuceneXaConnection.java:94)
    at org.neo4j.index.lucene.LuceneIndexService.indexThisTx(LuceneIndexService.java:220)
    at org.neo4j.index.impl.GenericIndexService.index(GenericIndexService.java:54)
    at org.neo4j.index.lucene.LuceneIndexService.index(LuceneIndexService.java:209)
    at JiraLoader$JiraExtractor$Item.setNodeProperty(JiraLoader.java:321)
    at JiraLoader$JiraExtractor$Item.updateGraph(JiraLoader.java:240)

Thanks, Paul Jackson
Re: [Neo4j] Query for combination of properties
Update - I added this and was satisfied with the results:

private void commitIfNecessary() {
    // Once the counter reaches txLimit, commit the current transaction and start a new one.
    if (transactions++ >= txLimit) {
        tx.success();
        System.out.println("Committing " + (transactions - 1) + " records to graph...");
        tx.finish();
        tx = databaseService.beginTx();
        transactions = 0;
    }
}

-Paul

-----Original Message-----
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Balazs E. Pataki
Sent: Thursday, July 08, 2010 10:19 AM
To: Neo4j user discussions
Subject: Re: [Neo4j] Query for combination of properties

A native solution would also be fine. This would practically allow what is not really possible with the current relationship lookup implementation: to have hundreds of thousands or millions of relationships on a Node and still be able to select relationships in a random-access manner by some parameters (e.g. relationship type, but maybe other properties as well). Would such native indexing require modifications to the current database file format, or could it be implemented as an additional service?

--- balazs

On 7/8/10 4:11 PM, Mattias Persson wrote:

No, (Lucene) indexing won't be implemented in getRelationships (it would totally break performance). However, there are possibilities to create some other type of indexing (on relationship type or direction, for example) natively.

2010/7/8 Balazs E. Pataki pat...@dsd.sztaki.hu

Great, thanks! Do you have any info on when 1.1 is expected? In the meantime we will use this laboratory version of the LuceneIndexProvider, because the multi-field search is essential in our case. By the way: I see that one can now also index relationships with the new API. Do you also plan to use these relationship indexes to make Node#getRelationships() and similar functions faster? So far it seems they look up relationships sequentially, which is pretty bad when you want to look for a specific type of relationship among 10.000 others.
(OK, it is more of a problem with 1 million relationships, but anyway, I'm just curious ;-) ) --- balazs On 7/8/10 3:21 PM, Mattias Persson wrote: Yeah, that API isn't stable yet, but I think that it will end up similar to that... and hopefully merged into kernel trunk after 1.1 sometime. You can use it for fun, but you should expect changes in it. 2010/7/7 Peter Neubauerpeter.neuba...@neotechnology.com Balazs, Mattias is writing this component, not sure how stable it is right now, but as I perceived it the API is starting to settle ... Would be great to get some more indexes tried out, feel free to experiment with Sphinx, might be a good alternative to Lucene? Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Wed, Jul 7, 2010 at 6:07 PM, Balazs E. Patakipat...@dsd.sztaki.hu wrote: That's great, works as expected. :-) Now, it seems you changed a lot of the indexing APIs. Should I use these new ones (and the neo4j sources from the SVN trunk), as these will be used in future versions, or these are still experimental? I ask this because in parallel we also investigate the possibility of integrating the shynx indexer (http://www.sphinxsearch.com/) to neo4j. If there's any experience or plans regarding sphynx, I would appreciate any info about it. Thanks again, --- balazs On 7/7/10 3:40 PM, Peter Neubauer wrote: Balazs, this is not explicitly possible today, but in the new Lucene-Index component in laboratory that will be integrated into trunk after Neo4j 1.1, see https://svn.neo4j.org/laboratory/components/lucene-index/src/test/java/org/neo4j/index/impl/lucene/TestLuceneIndex.java , method makeSureCompositeQueriesCanBeAsked . Sorry for the inconvenience! 
You could try out the component and let us know if that works for you? Cheers, /peter neubauer

On Wed, Jul 7, 2010 at 3:12 PM, Balazs E. Pataki <pat...@dsd.sztaki.hu> wrote: Toni, thanks for the hints! Here's my actual use case: I have Nodes storing texts in various languages. The Nodes have 2 properties:
- content: the actual text
- language: ISO language code of the text (eng, ger, hun, etc.)
I would like to search for Nodes containing a specific text in content having a specific
[Neo4j] Identifying similar nodes in a 2-mode network
Hi All, I am wondering if there is an algorithm that can identify nodes that are similar based upon their relationships to other nodes of a different type. For example, if I have a graph of people and items purchased, I would like to be able to identify people with similar buying habits. I think this is different from cliques in that the similar nodes may not know each other. A popular application would be how Netflix can retrieve a list of users with similar tastes to your own, based upon your movie ratings (a different type of node), as opposed to how Facebook suggests friends based upon mutual friends (similar types of nodes). I suppose there are at least two main approaches to the solution: one where no preprocessing of the data is performed and a query instead returns the most similar nodes by traversing the graph starting from the input node; and another where an algorithm maps these nodes into some (2-dimensional?) space, similar to how layout algorithms work, and the query for similar nodes becomes a spatial query in which the distance to the input node equates to the degree of similarity. It seems to me that this concept would be prevalent in document management. Am I on the right track? Are there algorithms or other research for this out there? Any suggestions on how to structure a graph to support this type of query? Thanks, Paul Jackson ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
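The first approach in the question above (no preprocessing, compare neighbourhoods at query time) can be sketched with Jaccard similarity over the sets of purchased items. This is only an illustration under stated assumptions: the class name PurchaseSimilarity, the in-memory map standing in for the graph, and the choice of Jaccard as the measure are all hypothetical and do not come from the thread or from any Neo4j API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: rank people by how much their purchased-item sets overlap (hypothetical helper). */
final class PurchaseSimilarity {

    /** Jaccard similarity |A and B| / |A or B|; defined here as 0 when both sets are empty. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    /** Returns every other person, ordered from most to least similar; ties are broken by name. */
    static List<String> mostSimilar(Map<String, Set<String>> purchases, String person) {
        Set<String> own = purchases.getOrDefault(person, Collections.<String>emptySet());
        List<String> others = new ArrayList<>(purchases.keySet());
        others.remove(person);
        others.sort(Comparator
                .comparingDouble((String other) -> jaccard(own, purchases.get(other)))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return others;
    }

    public static void main(String[] args) {
        // The map plays the role of person -> purchased-item relationships in the graph.
        Map<String, Set<String>> purchases = new HashMap<>();
        purchases.put("alice", new HashSet<>(Arrays.asList("book", "lamp", "tea")));
        purchases.put("bob", new HashSet<>(Arrays.asList("book", "tea")));
        purchases.put("carol", new HashSet<>(Arrays.asList("drill")));
        System.out.println(mostSimilar(purchases, "alice")); // prints [bob, carol]
    }
}
```

In a real graph the item sets would come from traversing two hops (person to item to person), so only people sharing at least one item need to be scored at all.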
Re: [Neo4j] Lucene Index on Relationships
I am not sure what a per-node relationship index is. I concur that a relationship index doesn't help if each node has a relationship of the type we are interested in (as in a graph of employees, where each employee would have a Manager relationship). However, in a graph where there are lots of nodes and only a few of them have a relationship of the type we are interested in, it seems logical to me that the optimal way to start a query is with the index into the relationships. -Paul

-----Original Message----- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Mattias Persson Sent: Monday, June 21, 2010 8:36 AM To: Neo4j user discussions Subject: Re: [Neo4j] Lucene Index on Relationships

Hi, how do you guys expect indexing for relationships to work? Would it be an index just as for nodes... or per node? I often hear that it'd speed up traversals if a node has many, many neighbours. But if the relationship index would be for the entire graph (not per node) that wouldn't really help, would it?

2010/6/21 Craig Taverner <cr...@amanzi.com>: A side comment, since I think indexing relationships with Lucene might be good, but think there might be alternatives for your current example. You said that the relationship property is a float from 0 to 1, so you cannot use relationship types, but actually, when you consider that any index is usually created by breaking data ranges (continuous or discrete) into fewer, more discrete ranges, you can use a relationship type to represent a range of floats. For example, if you have a roughly even distribution of floats between 0 and 1, try dividing that into 100 parts (0%-100%, or 0.01 to 1.00), and make a relationship type for each. This would certainly facilitate traversing relationships of specific float values (at least improve the performance dramatically, as in an index). Of course, this example focuses on traversing from a particular document.
If you are searching for all relationships in the entire database with particular float values, then a separate index would be better.

On Mon, Jun 21, 2010 at 2:11 PM, Marius Kubatz <marius.kub...@udo.edu> wrote: Hello guys, hello community! I'm currently evaluating neo4j for my thesis and have a wish :) I have already opened a ticket for this ( https://trac.neo4j.org/ticket/241 ), but I would like to hear what you guys think about it. Basically it just involves the ability to index Neo4j Relationships with the Lucene index. Neo4j works great on sparse graphs, but what happens when you have a very dense graph with several thousand neighbors for one node? Additionally, as soon as you store information on Relationships you will get into trouble, because you will have to iterate through all those edges to find the properties you seek. If this sounds far-fetched, please take a look at this example where one might need properties on Relationships: one Document node is related to another Document node by a similarity value which is stored on the Relationship between those document nodes. Let's just say that we save a float in [0, 1] on those relationships, which makes it impossible to create RelationshipTypes for every value. Using an index to fetch Relationships by their indexed properties would greatly speed up the process and increase the attractiveness of using properties on Relationships. I would love to have quick access to Relationship properties where I could add and implement fuzzy logic, probabilities, Bayesian networks, similarities, ranking ... and so on ... As said, thank you for Relationship properties, they are great and already there, but what I miss is quick access to them. Thank you very much and best regards! Marius -- Programs must be written for people to read, and only incidentally for machines to execute.
- Abelson Sussman, SICP, preface to the first edition -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
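Craig's suggestion in this thread, using one relationship type per range of similarity values instead of indexing the raw float, could look roughly like the sketch below. The class name, the SIMILAR_ prefix and the choice of 100 equal-width buckets are assumptions made for illustration; the code only computes a type name, and turning that name into an actual Neo4j relationship type (presumably via something like DynamicRelationshipType.withName in the 1.x API) is left out.

```java
import java.util.Locale;

/** Sketch: map a similarity score in [0, 1] to one of 100 bucketed relationship type names. */
final class SimilarityBuckets {

    /** Number of equal-width buckets; 100 follows the "divide into 100 parts" idea from the thread. */
    static final int BUCKETS = 100;

    /** Returns a bucket index in [0, BUCKETS - 1]; 1.0 is folded into the top bucket. */
    static int bucketOf(double similarity) {
        if (Double.isNaN(similarity) || similarity < 0.0 || similarity > 1.0) {
            throw new IllegalArgumentException("similarity must be in [0, 1]: " + similarity);
        }
        return Math.min((int) (similarity * BUCKETS), BUCKETS - 1);
    }

    /** Hypothetical naming scheme for the per-bucket relationship type, e.g. SIMILAR_050. */
    static String typeNameFor(double similarity) {
        return String.format(Locale.ROOT, "SIMILAR_%03d", bucketOf(similarity));
    }

    public static void main(String[] args) {
        System.out.println(typeNameFor(0.5)); // prints SIMILAR_050
        System.out.println(typeNameFor(1.0)); // prints SIMILAR_099
    }
}
```

A query for "similarity of at least 0.9" then only has to follow the ten types SIMILAR_090 through SIMILAR_099 from a given document node; the exact float can still be stored as a property for values near a bucket boundary, where floating-point rounding may place a score one bucket lower than expected.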
[Neo4j] Algorithms Best Practices
Hi, I am interested in providing network metrics such as centrality, eigenvector centrality, degree, etc. for graphs that I must assume will contain lots (millions+) of nodes. I am interested in any suggestions regarding the best way to approach this:
- Is it reasonable to add these metrics as properties of the nodes? My thought here is that this would work nicely when exporting the graph as GraphML.
- Can these metrics be maintained in the database over time, or should they be calculated as needed?
- Does the calculation of a metric for a single node require traversing the entire graph (or at least the sub-graph it is connected to)? Does it depend on the metric being calculated?
- If the answer is "yes, it takes a long time to update a set of metrics", what are the typical solutions? Do we go down a path like we do with data warehousing, where the graph is loaded from the operational store periodically in batches and then becomes stale over time? What might be some solutions for graphs that are constantly updated - or is the tradeoff simply that, to have valid metrics, your entire graph must be recalculated after any update? (For example - can a node be time stamped or something, or is it the case that any change to the graph can change the metrics for every other node?)
Thanks in advance. -Paul
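On the question of whether a metric can be maintained over time: it does depend on the metric. Degree is purely local, so adding or removing one relationship changes the value for its two endpoints only, whereas eigenvector centrality depends on the whole connected component and generally has to be recomputed or approximated. The sketch below shows the local case only; DegreeTracker and its in-memory map are hypothetical stand-ins for node properties in the database, not part of any Neo4j API.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: keep undirected degree counts up to date as edges are added and removed. */
final class DegreeTracker {

    // Stand-in for a "degree" property stored on each node.
    private final Map<String, Integer> degree = new HashMap<>();

    /** Adding an edge touches exactly two counters (a self-loop counts twice). */
    void addEdge(String a, String b) {
        degree.merge(a, 1, Integer::sum);
        degree.merge(b, 1, Integer::sum);
    }

    /** Assumes the edge exists; this sketch does not track individual edges. */
    void removeEdge(String a, String b) {
        decrement(a);
        decrement(b);
    }

    private void decrement(String node) {
        // Returning null drops the entry once the count would reach zero.
        degree.computeIfPresent(node, (key, value) -> value > 1 ? value - 1 : null);
    }

    int degreeOf(String node) {
        return degree.getOrDefault(node, 0);
    }

    public static void main(String[] args) {
        DegreeTracker tracker = new DegreeTracker();
        tracker.addEdge("a", "b");
        tracker.addEdge("a", "c");
        System.out.println(tracker.degreeOf("a")); // prints 2
        tracker.removeEdge("a", "b");
        System.out.println(tracker.degreeOf("a")); // prints 1
    }
}
```

For global metrics, the batch-and-go-stale approach described in the question is one option; recording when each value was last computed at least makes the staleness visible.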
[Neo4j] Exporting a Neo4J graph to RDF/XML
Disclaimer: I am new to this (but am committed to working the problem). I am interested in exporting my neo4j graphs to any of the supported xml formats (n3, turtle, etc.). I am interested in this because I am assuming that doing so will increase the interoperability of my graph database. Having spent some time on this I am beginning to question that assumption (because I perceive that RDF has restrictions that my graphs do not - requirements for URIs, each node must be connected to something, etc.). I am interested in feedback on that assumption. That said, I am having a more immediate problem just getting the data into an RDF format. I have tried using neo4j-rdf-sail-sesame but didn't find much to be gained beyond the RDF layer. The closest I was able to get was the following code, which does not work, I think because RDFStore.getStatements is not implemented for cases where both the subject and object are wildcards - it returns an iterable over an empty set.

    graph = new EmbeddedGraphDatabase(physicalPath);
    indexService = new LuceneIndexService(graph);
    rdfStore = new VerboseQuadStore(graph, indexService);
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    RDFWriter rdfWriter = Rio.createWriter(RioFormat.getRDFFormat(ExportFormat.TURTLE), bos);
    // Ask for every statement: wildcards in all four positions.
    WildcardStatement wildcardStatement = new WildcardStatement(new Wildcard("?s"), new Wildcard("?p"), new Wildcard("?o"), new Wildcard("?g"));
    Iterable<CompleteStatement> statements = rdfStore.getStatements(wildcardStatement, false);
    Transaction tx = graph.beginTx();
    try {
        rdfWriter.startRDF();
        for (CompleteStatement neoStatement : statements) {
            Statement sailStatement = GraphDatabaseSesameMapper.createStatement(neoStatement, true);
            rdfWriter.handleStatement(sailStatement);
        }
        rdfWriter.endRDF();
    } catch (RDFHandlerException e) {
        tx.failure();
        throw new GraphException("Exception thrown while exporting graph database " + getName(), e);
    } finally {
        tx.finish();
    }
    String result = bos.toString();

What is the best/easiest/recommended way to do this? Is it an edge case that most people wouldn't want to do - dumping the whole db to xml? Is there an approach that doesn't start with a query (like something that uses graph.listAllNodes())? On a side note, I was able to get GraphML working using the 0.1.2-SNAPSHOT version of tinkerpop's blueprints package. Thanks in advance. -Paul
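An approach that does not start with an RDF query would be to walk the nodes and relationships directly and emit one triple per node property and one per relationship. The sketch below shows only the string-building part, in N-Triples syntax, and rests on several assumptions that are not from the thread: the base URI http://example.org/, the scheme of minting URIs from node ids, property keys and relationship type names, and the requirement that keys and type names are already safe to use inside a URI. It does not use the neo4j-rdf or Sesame APIs, and it silently drops relationship properties, which have no direct counterpart in a plain triple.

```java
/** Sketch: turn property-graph elements into N-Triples lines (hypothetical URI scheme). */
final class NTriplesSketch {

    // Hypothetical base URI; a real export would use a namespace the data owner controls.
    static final String BASE = "http://example.org/";

    /** Escapes the characters that may not appear raw inside an N-Triples string literal. */
    static String escape(String value) {
        return value.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    /** One triple per node property: node URI, property URI, plain literal. */
    static String propertyTriple(long nodeId, String key, Object value) {
        return "<" + BASE + "node/" + nodeId + "> <" + BASE + "prop/" + key + "> \""
                + escape(String.valueOf(value)) + "\" .";
    }

    /** One triple per relationship: start node URI, type URI, end node URI. */
    static String relationshipTriple(long startId, String type, long endId) {
        return "<" + BASE + "node/" + startId + "> <" + BASE + "rel/" + type + "> <"
                + BASE + "node/" + endId + "> .";
    }

    public static void main(String[] args) {
        // In a real export these values would come from iterating the graph's nodes and relationships.
        System.out.println(propertyTriple(1, "name", "Ann"));
        System.out.println(relationshipTriple(1, "KNOWS", 2));
    }
}
```

Because every node gets a minted URI, nodes without a uri property and nodes without relationships can still be exported, at the cost of identifiers that mean nothing outside this database.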
Re: [Neo4j] Exporting a Neo4J graph to RDF/XML
Hi Marko, I have not been using the Neo4jSail client. I think my confusion stemmed from an assumption that the mere line new VerboseQuadStore(graph, indexService) was sufficient to perform the transformation. I think I am starting to understand but still have some questions. What I understand is that a neo4j graph's properties will not be visible when viewed as an RDF store. I also see that all nodes must have a uri property defined. If I use the base neo4j API, but I ensure all nodes have a uri property, should that suffice? Or must I do all input via the RDF interface? I still don't see how to execute an RDF query to dump the entire store no matter how I load the data - the getStatements call wants a non-wildcard in either subject or predicate. Thanks. -Paul

-----Original Message----- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Marko Rodriguez Sent: Friday, June 04, 2010 1:46 PM To: Neo4j user discussions Subject: Re: [Neo4j] Exporting a Neo4J graph to RDF/XML

Hi Paul, Are you storing your data in Neo4j using the Neo4jSail client? If so, then it's very easy to export RDF/XML (if yes, I can show you how to do it). If not, then you will have to come up with a piece of code that transforms the Neo4j data model into RDF, as Neo4j is a property graph and RDF is an edge-labeled graph. See: http://wiki.github.com/tinkerpop/blueprints/graph-morphisms There is no general-purpose solution to map a property graph to an RDF graph. If what I'm saying doesn't make sense, I can expand on it. Hope that helps, Marko. http://markorodriguez.com http://tinkerpop.com

On Jun 4, 2010, at 11:37 AM, Paul A. Jackson wrote: Thanks so much for the quick replies. I recognize both your names from my research the last few days. I was able to get the export to GraphML working using Blueprints and was happy with it.
I was also hoping to support RDF - not that I have a specific reason, but as I am trying to build a robust commercial product (http://www.pbinsight.com/products/data-management/data-quality-and-enrichment/enterprise-data-quality), I was basically hoping to be able to check the box. Is it the case that an RDF export is a reasonable thing to offer, but just not as common/open/flexible, or are the problems I am running into the result of a fundamental problem with the idea? Thanks, -Paul

-----Original Message----- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Marko Rodriguez Sent: Friday, June 04, 2010 1:08 PM To: Neo4j user discussions Subject: Re: [Neo4j] Exporting a Neo4J graph to RDF/XML

Hi, You can use Gremlin or just Blueprints [ http://blueprints.tinkerpop.com ] straight up. It's pretty straightforward: http://tinkerpop.com/maven2/com/tinkerpop/blueprints/0.1.1/api/ e.g.

    GraphMLReader.inputGraph(Graph graph, InputStream graphmlStream)
    GraphMLWriter.outputGraph(Graph graph, OutputStream graphmlStream)

If you do use Gremlin, use Gremlin 0.2.2 in downloads [ http://github.com/tinkerpop/gremlin/downloads ] as the repository's code is being re-written right now for Gremlin 1.0 coming out soon. In other words, don't build from source right now. Good luck, Marko. http://tinkerpop.com http://markorodriguez.com

On Jun 4, 2010, at 10:56 AM, Peter Neubauer wrote: Paul, I would recommend exporting the graph to GraphML. Currently, the easiest way would be to have your graph opened through Gremlin, http://gremlin.tinkerpop.com:

    java -jar ~/code/gremlin/target/gremlin-xxx-standalone.jar
             \,,,/
             (o o)
    -oOOo-(_)-oOOo-
    gremlin> $_g := neo4j:open('path/to/neo4j-db')
    gremlin> g:save('export.xml')

This will give you export.xml as GraphML that you can load with

    gremlin> g:load('export.xml')

back into a new graph. Would that work?
Cheers, /peter neubauer On Fri, Jun 4, 2010 at 5:26 PM, Paul A. Jackson paul.jack...@pb.com wrote: Disclaimer: I am new to this (but am committed to working the problem). I am interested in exporting my neo4j graphs to any of the supported xml formats (n3, turtle, etc.). I am interested in this because I am assuming that doing so will increase the interoperability of my graph database. Having spent some time on this I am beginning to question that assumption (because I perceive that RDF has restrictions that my graphs do not - requirements for URIs, each must be connected to something, etc.) I am interested in feedback on that assumption. That said, I am having a more immediate problem just