Re: [Neo] Community Program Review at FOSS4G 2010
Voted for it.

2010/5/4 Craig Taverner cr...@amanzi.com
> Hi guys,
>
> I've applied to present Neo4j Spatial (Neo4j as a true GIS database for
> mapping data) at the FOSS4G conference in September. To increase the
> chances of the presentation getting accepted, it helps to get community
> votes. So, if you think Neo4j Spatial is a cool idea, vote for it :-)
>
> Please follow this link to express your opinion:
> http://2010.foss4g.org/review/
>
> Regards, Craig
>
> -- Forwarded message --
> From: Lorenzo Becchi lbec...@osgeo.org
> Date: Wed, May 5, 2010 at 2:02 AM
> Subject: Community Program Review at FOSS4G 2010
> To: Lorenzo Becchi lore...@ominiverdi.com
>
> I would like to personally thank you for submitting your abstract for
> FOSS4G 2010. Below is the message promoting the public review of the
> 360 abstracts we've received. I imagine you want your abstract to be
> voted on and your community to support you. Please feel free to forward
> this message to as many people as possible to make this public review
> something really useful.
>
> best regards
> Lorenzo Becchi
>
> --
> At FOSS4G 2010 the community and conference registrants will have an
> opportunity to read through and score potential presentations prior to
> the selection of the final conference program. There is enough room in
> the conference schedule for 120 presentations. The conference committee
> will use the aggregate scores from the community review process to help
> choose which presentations to accept, and to assign presentations to
> appropriately sized rooms. The top-scoring presentations will receive
> special attention from the organization.
>
> Please follow this link to express your opinion:
> http://2010.foss4g.org/review/

-- Raul Raja
___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Evaluating Neo4J as an enterprise class application
We have stress tested neo4j with over 500 concurrent users in a webapp
with a smaller dataset and we found no performance issues. We even wrap
their API in a domain layer that adds some extra overhead.

One thing to keep in mind is that if your data ever grows to a point
where it needs to be distributed among machines, you can't do that with
the free version of Neo, but I think they support it with one of the
commercial licenses.

In my experience so far with Neo (over 5 months), since it is embedded,
if you use Java you get a much better experience than with any
relational DB behind an ORM layer such as Hibernate. The data is not
transported from your DB to a result set, then to a POJO, then to your
view objects. With Neo the data may already be in memory when you
request it, and there is no JDBC layer between your code and the graph.

We have also purposely crashed the JVM and app, hoping that at some
point it would corrupt the graph, and we have been doing this
repeatedly, at least 10 times a day, for the last 5 months. It has
always recovered and completed queued transactions. So far we have not
been able to corrupt the graph or bring it down.

Backing up the data is also easy, as a copy of the graph folder is all
you need.

PROS
- Fast
- Easy API
- Reliable
- High performance in our use cases
- The Neo team is fast at answering doubts and questions
- No SQL
- Fast relationship traversals; in the relational world this usually
  means JOINs, which are not very scalable.
- Ideal for scenarios where there are multiple relationships and
  interconnected objects

CONS
- Free version cannot be distributed across multiple machines
- Only one process or JVM can access the graph at a time
- No SQL (if you like SQL)
- Filtered traversals where results should be ordered usually require a
  full scan/traversal followed by a reorder of the results. This is not
  scalable when pagination is required and the results number in the
  millions. We have fixed this issue, though, by having a separate
  index for single ordered relationships. In a nutshell, this is what
  your typical relational DB provides as a B-tree index on properties,
  which lets you query with ORDER BY fast. Neo does not have that at
  the moment, so you have to keep your own indexes if you want ordered
  traversals. (Not a trivial task to implement.)

2010/5/2 suryadev vasudev suryadev.vasu...@gmail.com
> We are evaluating Neo4J for a business critical application. There
> will be a User Interface (UI) component to browse the graph, create
> nodes and properties, as well as create/modify relationships.
>
> The data set spans 7 domains and is expected to be around 40 GB. Users
> will manipulate data in 3 domains. A back end integration is expected
> to manage data in the remaining 4 domains. I use the word domain to
> mean nodes/relationships/attributes that are grouped to perform one
> activity like Sale Order, Shipping, Distribution etc. The domains are
> related to each other and queries traverse across different domains.
>
> We are expecting 500 users per hour to use the system. Each user may
> initiate a query once in 2 minutes. Each query is expected to traverse
> through 20,000 nodes and collect 10 properties for filtering/display.
>
> I am accountable for implementing this system. You probably know what
> accountable means :) Say it is related to Guillotine.
>
> What should I do to convince myself to move forward? Things that come
> to my mind are stability, scalability, auditing and monitoring.
>
> Stability means the JVM/application won't crash.
> Scalability means each user will get a response in 1-2 seconds for up
> to 500 users.
> Auditing means the system reports its performance for all interactions.
> Monitoring means health and performance of the system are made visible.
>
> Comments and pointers to related articles are appreciated.
>
> TIA
> SDev

-- Raul Raja
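The "keep your own ordered index" workaround mentioned in the reply above can be illustrated with plain Java collections. This is only a minimal sketch under stated assumptions: `OrderedNodeIndex`, `Entry`, `add` and `pageAfter` are hypothetical names, a `TreeSet` stands in for whatever structure the posters actually used, and nothing here is Neo4j API. Entries are kept sorted at insertion time, so a page can be read without scanning and sorting the full result set.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical ordered index: entries stay sorted as they are inserted. */
final class OrderedNodeIndex {

    /** The value to order by, plus the id of the node that carries it. */
    static final class Entry {
        final long sortKey;
        final long nodeId;

        Entry(long sortKey, long nodeId) {
            this.sortKey = sortKey;
            this.nodeId = nodeId;
        }
    }

    // Order by sortKey, then by nodeId so that equal sort keys do not collide.
    private final TreeSet<Entry> entries = new TreeSet<>(
            Comparator.comparingLong((Entry e) -> e.sortKey)
                      .thenComparingLong(e -> e.nodeId));

    /** Called whenever a node is created or its ordered property is set. */
    void add(long sortKey, long nodeId) {
        entries.add(new Entry(sortKey, nodeId));
    }

    /** Up to limit node ids sorting strictly after 'after' (null = first page). */
    List<Long> pageAfter(Entry after, int limit) {
        Iterable<Entry> tail = (after == null) ? entries : entries.tailSet(after, false);
        List<Long> page = new ArrayList<>();
        for (Entry e : tail) {
            if (page.size() == limit) {
                break;
            }
            page.add(e.nodeId);
        }
        return page;
    }

    public static void main(String[] args) {
        OrderedNodeIndex index = new OrderedNodeIndex();
        index.add(30, 1);
        index.add(10, 2);
        index.add(20, 3);
        index.add(20, 4);
        System.out.println(index.pageAfter(null, 2));             // [2, 3]
        System.out.println(index.pageAfter(new Entry(20, 3), 2)); // [4, 1]
    }
}
```

The sketch is not thread-safe and ignores updates and deletes; a real index would also have to be changed inside the same transaction as the graph, which is presumably part of why the reply calls this "not a trivial task".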
Re: [Neo] Estrange exception when running multithreaded
That makes sense, I'll give it a try, thanks!

2010/2/26 Mattias Persson matt...@neotechnology.com:
> 2010/2/26 Raul Raja Martinez raulr...@gmail.com:
>> Hi Mattias,
>>
>> Thanks for the docs. I'm trying to solve this issue now and here is
>> the problem I'm facing...
>>
>> Where it says in the wiki:
>>
>>   Rewrite your code, making sure that such scenarios won't happen.
>>   Run your deadlock-prone code in a
>>   try-catch(DeadlockDetectedException) block and just rerun the
>>   entire transaction if such an exception is caught.
>>
>> My transactions start and end in an interceptor that provides advice
>> to method invocations. The invocations happen inside a thread pool
>> concurrently. This is what the interceptor looks like. I have no way
>> to recover from the deadlock.
>>
>> Are you implying in that advice that I should synchronize on the
>> operation that is causing the write lock?
>> Is there any way to configure Neo to wait on deadlock for a release
>> up to a certain time?
>>
>> If I reissue a method call that is transactional, other state not
>> related to the neo transactions, such as indexes etc., may modify
>> other state objects. I'd rather have the transaction wait for release
>> and continue as other transactions are completing.
>
> To make the executing thread wait would result in a deadlock, that's
> why the exception is thrown, so that isn't really an option. Regarding
> state: if you're referring to components in neo4j, they all handle
> state correctly if a transaction is rolled back (in this case where a
> DeadlockDetectedException is thrown)... even the IndexService and such
> components, so that won't be a problem.
>
>> Here is the code for the interceptor. The advised methods are
>> Runnables that get executed async by a threadpool.
>>
>> /**
>>  * A method interceptor that provides transaction advice around a
>>  * method invocation opening and closing a transaction accordingly
>>  */
>> public class Neo4JTransactionAdviceInterceptor implements MethodInterceptor {
>>
>>     private final static Logger log = Logger.getLogger(Neo4JTransactionAdviceInterceptor.class);
>>
>>     private GraphDatabaseService neoService;
>>
>>     public void setNeoService(GraphDatabaseService neoService) {
>>         this.neoService = neoService;
>>     }
>>
>>     /**
>>      * provides transaction advice around a method invocation opening
>>      * and closing a transaction accordingly
>>      *
>>      * @param invocation the method invocation joinpoint
>>      * @return the result of the call to {@link
>>      *         org.aopalliance.intercept.Joinpoint#proceed()}, might be
>>      *         intercepted by the interceptor.
>>      * @throws Throwable if the interceptors or the
>>      *         target-object throws an exception.
>>      */
>>     public Object invoke(MethodInvocation invocation) throws Throwable {
>>         Transaction tx = neoService.beginTx();
>>         Object result = null;
>>         try {
>>             result = invocation.proceed();
>>             tx.success();
>>         } catch (DeadlockDetectedException e) {
>>             tx.failure();
>>             log.debug("deadlock detected for invocation " + invocation
>>                     + " result: " + result + " transaction: " + tx);
>>         } catch (Throwable t) {
>>             tx.failure();
>>             throw t;
>>         } finally {
>>             tx.finish();
>>         }
>>         return result;
>>     }
>> }
>
> How about making your code look something like this (I removed
> unnecessary tx.failure() calls, see
> http://wiki.neo4j.org/content/Transactions#Controlling_success):
>
> public Object invoke(MethodInvocation invocation) throws Throwable {
>     Object result = null;
>     for ( int i = 0; i < 10; i++ ) {
>         Transaction tx = neoService.beginTx();
>         try {
>             result = invocation.proceed();
>             tx.success();
>             return result;
>         } catch (DeadlockDetectedException e) {
>             log.debug("deadlock detected for invocation " + invocation
>                     + " result: " + result + " transaction: " + tx);
>         } finally {
>             tx.finish();
>         }
>     }
>     return result;
> }
>
> Where 10 is the number of retries to do before giving up. It may not
> look very elegant, but there's really no silver bullet for the
> deadlock problem.
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Neo Technology, www.neotechnology.com

-- Raul Raja
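The retry advice in the reply above can be reduced to a small helper that is independent of the interceptor. What follows is a sketch, not the Neo4j API: `DeadlockRetry`, `runWithRetry` and the stand-in `DeadlockException` are hypothetical names so the example compiles on its own, and a `Supplier` takes the place of `invocation.proceed()`. Unlike the loop quoted above, it rethrows the last deadlock once the attempts are used up instead of returning null, which is a design choice of this sketch rather than something the thread prescribes.

```java
import java.util.function.Supplier;

/** Sketch: rerun a unit of work when a (stand-in) deadlock exception is thrown. */
final class DeadlockRetry {

    /** Stand-in for the database's deadlock exception; an assumption of this sketch. */
    static final class DeadlockException extends RuntimeException {
        DeadlockException(String message) {
            super(message);
        }
    }

    /**
     * Runs work up to maxAttempts times. A real interceptor would begin the
     * transaction before work.get() and finish it in a finally block on
     * every attempt, as in the code quoted above.
     */
    static <T> T runWithRetry(Supplier<T> work, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        DeadlockException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (DeadlockException e) {
                last = e; // remember the failure and run the whole unit again
            }
        }
        throw last; // every attempt deadlocked: surface it to the caller
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new DeadlockException("simulated deadlock");
            }
            return "done";
        }, 10);
        System.out.println(result + " after " + calls[0] + " attempts"); // done after 3 attempts
    }
}
```

Any side effects the unit of work has outside the transaction are repeated on every attempt, which is the concern raised in the question; the helper does nothing to address that.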
Re: [Neo] Can you use EmbeddedGraphDatabase without a filesystem?
We set up an fs-based graph in the JVM's java.io.tmpdir folder and
destroy it on every setup()/teardown() if necessary. This is how all of
our unit tests run and it works fine. An in-memory store may give you a
false view of how things work if one of the things you are testing is
performance, especially if your production store is fs-based.

2010/2/26 Mattias Persson matt...@neotechnology.com:
> 2010/2/26 Hans Brattberg hans.brattb...@crisp.se:
>> Hi!
>>
>> For test automation it could be useful to create an instance of
>> EmbeddedGraphDatabase that doesn't use the file system, but only
>> keeps the data in memory, in the same way hsqldb can be configured.
>>
>> Is there another version of EmbeddedGraphDatabase for that purpose,
>> or does anyone have a solution for this?
>>
>> /Hans
>
> At the moment there's only the filesystem backend, but we're planning
> to add such an in-memory backend to have, as you say, for tests and
> such. One solution right now could be to create a file system in RAM,
> http://en.wikipedia.org/wiki/RAM_disk, but that would probably require
> you to delete/recreate the file system or use a new database folder
> name for each test... much like deleting your my/neo4j-db/ database
> folder before each test. So the problem wouldn't really go away.
>
> In most cases you can just have a
> MyFileUtils.deleteDirectory( my/neo4j-db ) before instantiating your
> EmbeddedGraphDatabase instance, but that's probably what you're trying
> to get away from :) ?
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Neo Technology, www.neotechnology.com

-- Raul Raja
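The setup()/teardown() approach described at the top of this message might look roughly like the following. It is a sketch using only the JDK: `TempStoreDir`, `create` and `deleteRecursively` are hypothetical helper names, and opening the database on the directory is left as a comment because it is not shown in the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical helper: one throwaway store directory per test. */
final class TempStoreDir {

    /** Creates a fresh, empty directory under the JVM's java.io.tmpdir. */
    static Path create() {
        try {
            return Files.createTempDirectory("neo4j-test-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes the directory and everything below it, children before parents. */
    static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = create();                                   // setup()
        Files.write(dir.resolve("store-file"), new byte[] {1, 2, 3});
        // ... a test would open the embedded database on dir.toString() here ...
        deleteRecursively(dir);                                // teardown()
        System.out.println("exists after teardown: " + Files.exists(dir)); // false
    }
}
```

Because every test gets a directory with a new name, a leftover store from a crashed run cannot leak into the next test, which sidesteps the delete-before-start step discussed in the quoted reply.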
Re: [Neo] Estrange exception when running multithreaded
Hi,

I got this solved by following your suggestion and, most importantly,
decreasing the corePoolSize in the thread pool. Obviously, the more
threads try to run in parallel, the more likely deadlocks become.

What's most intriguing is that this same code ran fine without changes
before we updated to 1.0.

Anyway, here is the updated interceptor...

/**
 * A method interceptor that provides transaction advice around a method
 * invocation opening and closing a transaction accordingly
 */
public class Neo4JTransactionAdviceInterceptor implements MethodInterceptor {

    private final static Logger log = Logger.getLogger(Neo4JTransactionAdviceInterceptor.class);

    private GraphDatabaseService neoService;

    public void setNeoService(GraphDatabaseService neoService) {
        this.neoService = neoService;
    }

    /**
     * provides transaction advice around a method invocation opening and
     * closing a transaction accordingly
     *
     * @param invocation the method invocation joinpoint
     * @return the result of the call to {@link
     *         org.aopalliance.intercept.Joinpoint#proceed()}, might be
     *         intercepted by the interceptor.
     * @throws Throwable if the interceptors or the
     *         target-object throws an exception.
     */
    public Object invoke(MethodInvocation invocation) throws Throwable {
        boolean deadlockDetected = false;
        int reattempts = 10;
        Object result = null;
        for (int i = 0; i < reattempts; i++) { // try up to ten times
            if (deadlockDetected) {
                Thread.sleep(100);
            }
            Transaction tx = neoService.beginTx();
            try {
                if (deadlockDetected) {
                    // is this the right way to handle invoking the method?
                    result = invocation.proceed();
                } else {
                    result = invocation.proceed();
                }
                tx.success();
                if (deadlockDetected) {
                    log.debug("successfully recovered from deadlock");
                }
                return result;
            } catch (DeadlockDetectedException e) {
                deadlockDetected = true;
                log.debug("deadlock detected for invocation " + invocation + " attempt: " + i);
                tx.failure();
                if (i == reattempts - 1) { // the last attempt failed as well
                    log.error("permanent failure after attempt " + i);
                }
            } finally {
                tx.finish();
            }
        }
        return result;
    }
}

-- Raul Raja
Re: [Neo] Estrange exception when running multithreaded
Hi Mattias,

Thanks for the docs. I'm trying to solve this issue now and here is the
problem I'm facing...

Where it says in the wiki:

  Rewrite your code, making sure that such scenarios won't happen.
  Run your deadlock-prone code in a try-catch(DeadlockDetectedException)
  block and just rerun the entire transaction if such an exception is
  caught.

My transactions start and end in an interceptor that provides advice to
method invocations. The invocations happen inside a thread pool
concurrently. This is what the interceptor looks like. I have no way to
recover from the deadlock.

Are you implying in that advice that I should synchronize on the
operation that is causing the write lock?
Is there any way to configure Neo to wait on deadlock for a release up
to a certain time?

If I reissue a method call that is transactional, other state not
related to the neo transactions, such as indexes etc., may modify other
state objects. I'd rather have the transaction wait for release and
continue as other transactions are completing.

Here is the code for the interceptor. The advised methods are Runnables
that get executed async by a threadpool.

/**
 * A method interceptor that provides transaction advice around a method
 * invocation opening and closing a transaction accordingly
 */
public class Neo4JTransactionAdviceInterceptor implements MethodInterceptor {

    private final static Logger log = Logger.getLogger(Neo4JTransactionAdviceInterceptor.class);

    private GraphDatabaseService neoService;

    public void setNeoService(GraphDatabaseService neoService) {
        this.neoService = neoService;
    }

    /**
     * provides transaction advice around a method invocation opening and
     * closing a transaction accordingly
     *
     * @param invocation the method invocation joinpoint
     * @return the result of the call to {@link
     *         org.aopalliance.intercept.Joinpoint#proceed()}, might be
     *         intercepted by the interceptor.
     * @throws Throwable if the interceptors or the
     *         target-object throws an exception.
     */
    public Object invoke(MethodInvocation invocation) throws Throwable {
        Transaction tx = neoService.beginTx();
        Object result = null;
        try {
            result = invocation.proceed();
            tx.success();
        } catch (DeadlockDetectedException e) {
            tx.failure();
            log.debug("deadlock detected for invocation " + invocation
                    + " result: " + result + " transaction: " + tx);
        } catch (Throwable t) {
            tx.failure();
            throw t;
        } finally {
            tx.finish();
        }
        return result;
    }
}

2010/2/22 Mattias Persson matt...@neotechnology.com:
> I wrote a reply to this, but decided to put it on the wiki instead...
> so head over to http://wiki.neo4j.org/content/Transactions#Deadlocks
> and read all about it :)
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Neo Technology, www.neotechnology.com

-- Raul Raja
Re: [Neo] Neo4j Traverse API
If your result set returns too many records, sorting after retrieving
the results is going to be slow. If your results have to be paginated,
in most cases you have to sort over the full result set. Also, if you
plan to get things ordered by multiple properties, you need some kind of
B-tree structure. This is the kind of stuff that is easier to accomplish
with most relational DBs since they keep track of indexes for you.

After trying different approaches, in our case it came down to keeping
ordered relationships, asc and desc, as their own relationships. In the
next couple of months we will be releasing a Java interface mapping to
the neo store that uses Java dynamic proxies, which access the store
lazily and allow you to transparently keep track of ordered
relationships behind the scenes by keeping results ordered at insertion
time.

Neo4j lacks the concept of indexes that you can set up to get results
ordered by arbitrary properties or relationships. AFAIK, returning nodes
ordered by properties in neo requires a full traversal/scan and then
sorting all the results.

2010/2/24 Satish Varma Dandu dsva...@gmail.com:
> Hi John,
>
> Thanks for the reply. Consider a scenario like LinkedIn:
>
> 1) I wanna search for all profiles in LinkedIn matching Neo4J
> 2) Now I get, let's say, 20 people having Neo4J on their profiles.
>
> So far so good. But I wanna order these search results based on my
> order. Like, first I wanna see search results from my direct contacts,
> followed by next-order results. The worst case scenario is, once I get
> these search results, for each search result profile I need to
> traverse and find the path. But this takes a lot of time if I get too
> many search results. So somehow I wanna combine both Lucene and
> traversal. Is this doable with Neo4J?
>
> Hope I explained the problem. Any help would be great.
>
> Thanks,
> -Satish
>
> On Wed, Feb 24, 2010 at 1:29 AM, Johan Svensson
> jo...@neotechnology.com wrote:
>> Hi,
>>
>> You can not make such an ordered search using the Lucene indexing
>> service. You could try to only use a traverser instead of a Lucene
>> search and let the traverser do the filtering.
>>
>> I am not sure I understand your problem completely. If you could
>> describe the problem in more detail I am sure we can come up with a
>> good solution for it.
>>
>> Regards,
>> -Johan
>>
>> On Tue, Feb 23, 2010 at 11:00 PM, Satish Varma Dandu
>> dsva...@gmail.com wrote:
>>> Hi,
>>>
>>> I am new to Neo4J, and so far it looks really good for traversing
>>> nodes. I have a question on using the Traverser API. Can we order
>>> the Lucene search results by degree? When I search for some data
>>> using Lucene, I will get some nodes. Now I want to arrange those
>>> search result nodes in the order of level, i.e. first I want
>>> results from my direct nodes, then the next level of nodes, etc.
>>> Is this supported out of the box? Appreciate if you could point me
>>> to the correct resource.
>>>
>>> Thanks, Regards,
>>> -Satish

-- Raul Raja
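The ordering asked about in this thread (direct contacts first, then the next level, and so on) amounts to sorting the search hits by breadth-first distance from the searching user. The sketch below shows that idea on a plain adjacency map; `DegreeOrdering`, `distances` and `orderByDegree` are hypothetical names, the map stands in for the graph, and neither the Lucene index service nor the Traverser API is used.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: order search hits by their breadth-first distance from a start node. */
final class DegreeOrdering {

    /** Breadth-first distances from start; unreachable nodes are absent from the map. */
    static Map<String, Integer> distances(Map<String, List<String>> graph, String start) {
        Map<String, Integer> dist = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        dist.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            for (String next : graph.getOrDefault(node, Collections.emptyList())) {
                if (!dist.containsKey(next)) {
                    dist.put(next, dist.get(node) + 1);
                    queue.add(next);
                }
            }
        }
        return dist;
    }

    /** Hits closest to start come first; unreachable hits go to the end. */
    static List<String> orderByDegree(Map<String, List<String>> graph, String start,
                                      Collection<String> hits) {
        Map<String, Integer> dist = distances(graph, start);
        List<String> ordered = new ArrayList<>(hits);
        ordered.sort(Comparator.comparingInt(
                (String h) -> dist.getOrDefault(h, Integer.MAX_VALUE)));
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("me", Arrays.asList("a", "b"));
        graph.put("a", Arrays.asList("c"));
        graph.put("c", Arrays.asList("d"));
        // "x" is a hit that is not connected to "me" at all.
        System.out.println(orderByDegree(graph, "me", Arrays.asList("d", "x", "c", "a")));
        // prints [a, c, d, x]
    }
}
```

This walks everything reachable from the start node once instead of finding a path per hit, which addresses the worst case described in the question, but on a large graph the walk would need a depth limit. That limit is left out here.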
[Neo] Reset traverser iterator
Hi,

I was wondering if it is possible to reset a traverser iterator.
Looking at the Traverser interface comment, it says...

// Doc: especially remove() thing
/**
 * Returns an {@link Iterator} representing the traversal of the graph. The
 * iteration is completely lazy in that it will only traverse one step (to
 * the next hit) for every call to {@code hasNext()}/{@code next()}.
 *
 * Consecutive calls to this method will return the same instance.
 *
 * @return An iterator for this traverser
 */
// TODO completely resolve issues regarding this (Iterable/Iterator ...)
// Doc: does it create a new iterator or reuse the existing one? This is
// very important! It must be re-use, how else would currentPosition()
// make sense?
public Iterator<Node> iterator();

thanks
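If the contract quoted above holds ("consecutive calls to this method will return the same instance"), a second loop over the same object picks up where the first one stopped rather than starting over. The sketch below demonstrates that consequence with a stand-in class; `SingleIteratorDemo` and `OneShot` are hypothetical and are not the Traverser implementation.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Sketch: what "iterator() returns the same instance" means for repeated loops. */
final class SingleIteratorDemo {

    /** Stand-in for a traverser whose iterator() always hands back one shared iterator. */
    static final class OneShot<T> implements Iterable<T> {
        private final Iterator<T> shared;

        OneShot(List<T> items) {
            this.shared = items.iterator();
        }

        @Override
        public Iterator<T> iterator() {
            return shared; // same instance on every call
        }
    }

    /** Counts how many elements one full for-each pass yields. */
    static <T> int count(Iterable<T> items) {
        int n = 0;
        for (T ignored : items) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("a", "b", "c");
        OneShot<String> traverser = new OneShot<>(nodes);
        System.out.println(count(traverser));            // 3: first pass consumes everything
        System.out.println(count(traverser));            // 0: shared iterator is already exhausted
        System.out.println(count(new OneShot<>(nodes))); // 3: a new instance starts from the beginning
    }
}
```

Under that reading, the way to start over would be to create a new traverser rather than to reset the existing iterator; whether the real interface behaves this way is exactly what the TODO in the quoted source leaves open.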
[Neo] Estrange exception when running multithreaded
Hi,

I have some code that runs in parallel through a thread pool executor.
I'm getting the following exception:

publish exception: class org.neo4j.kernel.impl.transaction.DeadlockDetectedException :
Transaction[Status=STATUS_ACTIVE,ResourceList=Xid[GlobalId[NEOKERNL|1266821950597|383448], BranchId[ 52 49 52 49 52 49 ]]
  xaresource[org.neo4j.kernel.impl.nioneo.xa.neostorexaconnection$neostorexaresou...@16107eff] Status[ENLISTED]
can't wait on resource RWLock[NodeImpl#9058] since =
Transaction[Status=STATUS_ACTIVE,ResourceList=Xid[GlobalId[NEOKERNL|1266821950597|383448], BranchId[ 52 49 52 49 52 49 ]]
  xaresource[org.neo4j.kernel.impl.nioneo.xa.neostorexaconnection$neostorexaresou...@16107eff] Status[ENLISTED]
- RWLock[NodeImpl#187780]
- Transaction[Status=STATUS_ACTIVE,ResourceList=Xid[GlobalId[NEOKERNL|1266821950597|383448], BranchId[ 52 49 52 49 52 49 ]]
  xaresource[org.neo4j.kernel.impl.nioneo.xa.neostorexaconnection$neostorexaresou...@16107eff] Status[ENLISTED]
- RWLock[NodeImpl#9058]
    at com.cirqe.api.persistence.graph.impl.neo4j.Neo4JGraphNodeProxy.invoke(Neo4JGraphNodeProxy.java:127)

I thought I'd post it since we had not seen this until we upgraded to
the latest version.

Thanks in advance!

Raul
Re: [Neo] filtering content based on tags in multiple levels
Hi Sumanth,

You can have all your questions and answers be nodes that are connected
through relationships that are your tags. For any kind of filtering you
can have Traversers with returnable evaluators that evaluate the kind of
results you want back from your graph structure.

So the basic answer is yes, you can do that with neo4j, and I believe in
an easier, faster and more natural way than you would otherwise do it in
a relational database.

2010/2/17 Sumanth Thikka suma...@truesparrow.com
> Hi,
>
> Consider the following scenario:
>
> We have some questions and some tag(s) associated with each question.
> We have tags (let's say A, B, C, D, E etc.) associated with some
> questions (just like in stackoverflow http://stackoverflow.com/).
>
> We should be able to filter all the questions we have based on a tag
> (let's say it is A). Once we have all questions having tag A, the
> other tags associated with this set of questions need to be listed.
> Selecting (filtering) a tag (say B) from these second level tags
> should filter questions having the tags A and B. The filtered
> questions have tags A, B and possibly some others.
>
> The same scenario should be served to any level of tags.
>
> Is it possible to achieve this by using neo4j? If yes, how can we
> achieve this?
>
> Thanks in advance.

-- Raul Raja
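The drill-down described in the question can be pinned down with a small in-memory model before mapping it onto nodes and relationships. This is only a sketch of the filtering logic: `TagFilter`, `questionsWithAll` and `nextLevelTags` are hypothetical names, and a map from question to tags stands in for the tag relationships that a traverser would follow in the graph.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: multi-level tag filtering over a question -> tags map. */
final class TagFilter {

    /** Questions that carry every selected tag. */
    static Set<String> questionsWithAll(Map<String, Set<String>> tagsByQuestion,
                                        Set<String> selected) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : tagsByQuestion.entrySet()) {
            if (e.getValue().containsAll(selected)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Tags that can still be chosen: those on the filtered questions, minus the selection. */
    static Set<String> nextLevelTags(Map<String, Set<String>> tagsByQuestion,
                                     Set<String> selected) {
        Set<String> tags = new TreeSet<>();
        for (String question : questionsWithAll(tagsByQuestion, selected)) {
            tags.addAll(tagsByQuestion.get(question));
        }
        tags.removeAll(selected);
        return tags;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> tagsByQuestion = new HashMap<>();
        tagsByQuestion.put("q1", new HashSet<>(Arrays.asList("A", "B")));
        tagsByQuestion.put("q2", new HashSet<>(Arrays.asList("A", "C")));
        tagsByQuestion.put("q3", new HashSet<>(Arrays.asList("B", "C")));
        tagsByQuestion.put("q4", new HashSet<>(Arrays.asList("A", "B", "D")));

        Set<String> levelOne = Collections.singleton("A");
        System.out.println(questionsWithAll(tagsByQuestion, levelOne)); // [q1, q2, q4]
        System.out.println(nextLevelTags(tagsByQuestion, levelOne));    // [B, C, D]

        Set<String> levelTwo = new HashSet<>(Arrays.asList("A", "B"));
        System.out.println(questionsWithAll(tagsByQuestion, levelTwo)); // [q1, q4]
        System.out.println(nextLevelTags(tagsByQuestion, levelTwo));    // [D]
    }
}
```

Each extra level just adds one tag to the selected set, so the same two functions serve any depth. In a graph the scan over all questions would be replaced by starting from the node of one selected tag and following its relationships to the questions.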
Re: [Neo] neo4j performance testing
Hi Dmitri,

Are you running JMeter distributed with many agents? Is the sample data
sent to each servlet the same for each JMeter agent? Our experience with
JMeter is that when running multiple threads on a single machine, the
numbers don't really give you a close grasp of reality. Also, their HTTP
client seems slower than other clients like command line siege.

2010/2/11 Dmitri Livotov dmi...@livotov.eu
> We started initial complex tests and the db performs really well at
> the moment. A very nice thing is that it does not eat heap excessively
> as most Java data engines do.
>
> Here are the test data and test cases. Please correct/append to them
> if you think that something is not optimal; we'd like to make a
> complete real use case test and share the results, so they could be
> useful for anyone else.
>
> 1) Test database
>
> We've imported the entire Linux server filesystem structure with
> hierarchy relations, relations to file/folder classes (the million
> relations to a single node case), also copying file/folder attributes
> into their nodes. The total size is 3M nodes, with almost 10
> attributes per node.
>
> 2) Test environment
>
> The neo4j database is put into a web application, initialized in a
> context listener, and then the single database instance is requested
> by all test servlets. The web application will be deployed to a
> Glassfish v3 app server with a 2G heap allocated, with a total of 4G
> memory on the physical server (CentOS 5.2, 64 bit kernel).
>
> There are several test servlets in the web application. Each servlet,
> on a request, performs a single operation (several low level
> operations with the db are grouped) and responds with either a 200
> code or a 500 code (on any exception) and zero response size.
>
> The test client is an Apache JMeter instance, running on another
> desktop machine, tied with a 100M link to the server on the same
> subnet via a 1G switch.
>
> 3) Test cases
>
> servlet 0 - simulates a static web page to see how the db load affects
>   the entire app server. Does almost nothing: makes a random delay
>   from 10 to 200 ms and finishes.
> servlet 1 - performs a random node lookup by a primary key, changes 10
>   node attributes and commits the transaction.
> servlet 2 - performs a random node lookup by a primary key, then
>   traverses the hierarchy until the end of the graph (up and down),
>   iterating it all in an empty loop.
> servlet 3 - the long running one: requests all nodes (via a traverse
>   request from the top node), iterates them all and randomly adjusts
>   (add/remove/update) a single attribute (no more than 100 attributes
>   per cycle), then commits.
>
> Should we add more test servlets you think can be useful for high load
> testing? Maybe you would also suggest changing/updating the test data,
> say more depth levels, more relationships?
>
> 4) Load factors
>
> servlet 0 is called from 100 parallel threads
> servlets 1 and 2 are called from 50 parallel threads (50 for each one,
>   so 100 threads in total)
> servlet 3 is called from a single thread
>
> so we have 201 parallel threads in total.
>
> Thanks,
> Dmitri

-- Raul Raja
Re: [Neo] neo4j performance testing
We have seen JMeter test highly impact the result after running 30+ threads, but it also depends on hardware. Somebody should come up with a distributed ec2 ami to run jmeter tests distributed where you just tell it I want to simulate load for 50.000 users and it creates ec2 instances that are tears down after the tests. Other thing to keep in mind is that to simulate load on the servlets you need to add ramp up time and think time for the users and let it warm up for a few mins if you're trying to measure performance on a warm cache 2010/2/11 Dmitri Livotov dmi...@livotov.eu Hi Raul, we plan to run multiple agents (on a final test) to reduce wrong timing effects from jmeter host, thanks for pointing to http client issue. Jmeter now has two built-in implementations - standrad and apache httpclient based clients, but it seems we'll need to write custom picker with siege as well to compare timings. I do not think we'll get the same number of agent hosts to run 1 thread per agent host, but will probably be running about 5-10 threads per agent. Do you think this will be enough or still too high for single jmeter host ? Probably, the better idea then will be to rent a required number of VPS'es and gen agent per thread environment (not sure, however, if this will be a not too expensive solution) ? Thanks for sharing your experience, Dmitri Raul Raja Martinez wrote: Hi DMitri, are you running JMeter distributed with many agents? Is the sample data sent to each servlet for each JMeter agent the same? Our experience with JMeter is that when running multiple threads on a single machine the numbers don't really give you a close grasp or reality. Also their http client seems slower than other clients like command line siege. 2010/2/11 Dmitri Livotov dmi...@livotov.eu We started initial complex tests and db performs really well at the moment. Very nice thing that it does not eat heap excessively as most java dataengines. 
Here are the test data and test cases; please correct/append to them if you think something is not optimal. We'd like to make a complete, realistic use-case test and share the results so they can be useful to anyone else.
1) Test database
We've imported the entire filesystem structure of a Linux server with hierarchy relations, plus relations to file/folder classes (the "million relations to a single node" case), copying file/folder attributes into their nodes. The total size is 3M nodes, with almost 10 attributes per node.
2) Test environment
The neo4j database is put into a web application, initialized in a context listener, and the single database instance is then used by all test servlets. The web application will be deployed to a GlassFish v3 app server with a 2G heap, out of a total of 4G memory on the physical server (CentOS 5.2, 64-bit kernel). There are several test servlets in the web application; each servlet, on a request, performs a single operation (several low-level operations on the db are grouped) and responds with either a 200 code or a 500 code (on any exception) and a zero-size response. The test client is an Apache JMeter instance running on another desktop machine, connected over a 100M link to the server on the same subnet via a 1G switch.
3) Test cases
servlet 0 - simulates a static web page, to see how the db load affects the entire app server. Does almost nothing: sleeps for a random delay of 10 to 200 ms and finishes.
servlet 1 - performs a random node lookup by primary key, changes 10 node attributes and commits the transaction.
servlet 2 - performs a random node lookup by primary key, then traverses the hierarchy to the ends of the graph (up and down) and iterates over all of it in an empty loop.
servlet 3 - the long-running one: requests all nodes (via a traversal from the top node), iterates over them all and randomly adjusts (adds/removes/updates) a single attribute (no more than 100 attributes per cycle), then commits.
Should we add more test servlets that you think would be useful for high-load testing? Would you also suggest changing the test data, say more depth levels or more relationships?
4) Load factors
servlet 0 is called from 100 parallel threads
servlets 1 and 2 are called from 50 parallel threads each (100 threads in total)
servlet 3 is called from a single thread
so we have 201 parallel threads in total.
Thanks, Dmitri
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] bug in Node.delete() ?
I was thinking more along the lines of:
node.delete() // does not delete if there are relationships
node.delete(true) // deletes the node including its relationships
But again, no big deal; it is already good the way it is.

2010/2/10 Mattias Persson matt...@neotechnology.com: Hmm, maybe I didn't answer your question... you were wondering about automatically deleting all relationships on a node in Node.delete(), right? Well, I don't know whether Neo4j will support that or not, but we made the decision not to have it, since it's good to be as explicit as possible so that potential code bugs aren't hidden but instead surface as early as possible.

2010/2/10 Mattias Persson matt...@neotechnology.com: I don't think that's a very good idea to have as default behaviour: since potentially the entire graph is interconnected (it isn't just a tree structure), deleting one node could potentially delete the entire database. We could however have some kind of utility where you specify criteria to traverse and delete, but then again you could probably use the traverser API for that. Does that make enough sense to you?

2010/2/10, Raul Raja Martinez raulr...@gmail.com: Any plans to support cascade deletion? We currently have some cases where we have to manually iterate over the relationships and remove them. Not a big deal, because we wrap many other operations, but it is something that would be nice to have out of the box.

2010/2/9 Mattias Persson matt...@neotechnology.com: Yep, that's right... The documentation is a bit unclear/untrue about that. Ryan is right in that it's only checked when the transaction is committed. This means that you can delete a node which has relationships on it, as long as you delete its relationships before the transaction is committed.

2010/2/9, Ryan Levering rrlever...@gmail.com: Well, it's not totally untrue. Try changing your tx.failure() to tx.success() and you'll get the error you're looking for.
Neo is a fairly transaction-driven database and I imagine it doesn't check the constraints of the delete until it's committed.

On Feb 9, 2010, at 3:28 PM, Stefan Armbruster wrote: Hi, it looks like Node.delete() does not behave as defined in the API docs: "Deletes this node if it has no relationships attached to it. If delete() is invoked on a node with relationships, an unchecked exception will be raised. Invoking any methods on this node after delete() has returned is invalid and will lead to unspecified behavior." I've created some sample code (http://pastebin.com/f2918f9d5) demonstrating this. Line 25 tries to delete a node with a relationship attached to it. There's no exception (contrary to what the API docs lead one to expect) and the node is still there afterwards. So delete() seems to simply do nothing if there are any relationships. Did I miss something or should we consider this a bug? Regards, Stefan
-- Mattias Persson, [matt...@neotechnology.com] Neo Technology, www.neotechnology.com
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
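The behaviour described in this thread (the delete constraint is checked at commit time, so a node can be deleted as long as its relationships are deleted in the same transaction) can be illustrated with a small sketch. This is not the Neo4j API: TinyGraph, Tx, deleteRelationshipsOf and the other names are hypothetical stand-ins, kept in memory so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for a graph store; NOT the Neo4j API.
class TinyGraph {
    private final Set<Integer> nodes = new HashSet<>();
    // Each relationship is a two-element array {startNodeId, endNodeId}.
    private final List<int[]> relationships = new ArrayList<>();
    private int nextId = 0;

    int createNode() { nodes.add(nextId); return nextId++; }
    void relate(int from, int to) { relationships.add(new int[] {from, to}); }
    boolean hasNode(int id) { return nodes.contains(id); }
    Tx beginTx() { return new Tx(); }

    // A unit of work that records deletes and only validates them on commit.
    class Tx {
        private final Set<Integer> deletedNodes = new HashSet<>();
        private final List<int[]> deletedRels = new ArrayList<>();

        // No constraint check here: the delete is only recorded.
        void deleteNode(int id) { deletedNodes.add(id); }

        // Explicit "cascade": mark every relationship touching the node as deleted.
        void deleteRelationshipsOf(int id) {
            for (int[] r : relationships) {
                if (r[0] == id || r[1] == id) deletedRels.add(r);
            }
        }

        // The constraint is enforced here, before any state is changed.
        void commit() {
            for (int[] r : relationships) {
                if (deletedRels.contains(r)) continue;
                if (deletedNodes.contains(r[0]) || deletedNodes.contains(r[1])) {
                    throw new IllegalStateException("node still has relationships");
                }
            }
            relationships.removeAll(deletedRels);
            nodes.removeAll(deletedNodes);
        }
    }
}
```

In this sketch a commit that would leave a dangling relationship fails and changes nothing, while deleting the relationships first lets the same node delete go through, which matches what Mattias and Ryan describe above.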
Re: [Neo] bug in Node.delete() ?
Any plans to support cascade deletion? We currently have some cases where we have to manually iterate over the relationships and remove them. Not a big deal, because we wrap many other operations, but it is something that would be nice to have out of the box.

2010/2/9 Mattias Persson matt...@neotechnology.com: Yep, that's right... The documentation is a bit unclear/untrue about that. Ryan is right in that it's only checked when the transaction is committed. This means that you can delete a node which has relationships on it, as long as you delete its relationships before the transaction is committed.

2010/2/9, Ryan Levering rrlever...@gmail.com: Well, it's not totally untrue. Try changing your tx.failure() to tx.success() and you'll get the error you're looking for. Neo is a fairly transaction-driven database and I imagine it doesn't check the constraints of the delete until it's committed.

On Feb 9, 2010, at 3:28 PM, Stefan Armbruster wrote: Hi, it looks like Node.delete() does not behave as defined in the API docs: "Deletes this node if it has no relationships attached to it. If delete() is invoked on a node with relationships, an unchecked exception will be raised. Invoking any methods on this node after delete() has returned is invalid and will lead to unspecified behavior." I've created some sample code (http://pastebin.com/f2918f9d5) demonstrating this. Line 25 tries to delete a node with a relationship attached to it. There's no exception (contrary to what the API docs lead one to expect) and the node is still there afterwards. So delete() seems to simply do nothing if there are any relationships. Did I miss something or should we consider this a bug?
Regards, Stefan
-- Mattias Persson, [matt...@neotechnology.com] Neo Technology, www.neotechnology.com
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] org.neo4j.impl.transaction.TransactionFailureException: No mapping found for branchId
Thanks!

2010/2/7 Johan Svensson jo...@neotechnology.com: Fix committed to kernel and index trunks. -Johan

On Sat, Feb 6, 2010 at 10:33 PM, Raul Raja Martinez raulr...@gmail.com wrote: Yes, that's right, Thanks!

2010/2/6 Mattias Persson matt...@neotechnology.com: Oh, I got it. You're using the LuceneFulltextIndexService, right? It has that branchId. So this is a bug on our end and we'll fix it right away!

2010/2/5 Raul Raja Martinez raulr...@gmail.com: I'm not registering or configuring anything special as far as the NeoEmbedded store goes. We have been purposely crashing our app in the middle of transactions and killing the JVM processes to test data integrity under unexpected crashes. Despite this being a bug we are very happy with the outcome, because as of today we have always been able to restart, and the worst case we have been in so far was solved by removing the transaction logs.

2010/2/5 Johan Svensson jo...@neotechnology.com: Hi, This could be a bug. Do you use the XA-framework (org.neo4j.kernel.impl.transaction.xaframework) to register any additional data sources? If not, this is a bug, since the global transaction log finds an entry with branch id 262374 and no data source has been registered for that branch id. It would be great if you could write a test case that triggers this bug, or send me the active_tx_log together with the tm_tx_log.1 and tm_tx_log.2 logs next time it happens. Regards, -Johan

On Thu, Feb 4, 2010 at 7:33 PM, Raul Raja Martinez raulr...@gmail.com wrote: I'm not sure if this is a known bug or whether it has already been fixed. I'm seeing this in 1.0-b11-SNAPSHOT. Sometime after killing an app running Neo I'm unable to restart the app, as it always fails creating the EmbeddedNeo object.
Caused by: org.neo4j.impl.transaction.TransactionFailureException: No mapping found for branchId[0x262374]
    at org.neo4j.impl.transaction.XaDataSourceManager.getXaResource(XaDataSourceManager.java:182)
    at org.neo4j.impl.transaction.TxManager.getXaResource(TxManager.java:867)
    at org.neo4j.impl.transaction.TxManager.buildRecoveryInfo(TxManager.java:385)
    at org.neo4j.impl.transaction.TxManager.recover(TxManager.java:231)
    at org.neo4j.impl.transaction.TxManager.init(TxManager.java:159)
    at org.neo4j.impl.transaction.TxModule.start(TxModule.java:79)
    at org.neo4j.api.core.NeoJvmInstance.start(NeoJvmInstance.java:154)
    at org.neo4j.api.core.NeoJvmInstance.start(NeoJvmInstance.java:64)
    at org.neo4j.api.core.EmbeddedNeoImpl.init(EmbeddedNeoImpl.java:68)
    at org.neo4j.api.core.EmbeddedNeo.init(EmbeddedNeo.java:49)
A workaround is to manually remove the transaction logs: active_tx_log, tm_tx_log.1, tm_tx_log.2
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] org.neo4j.impl.transaction.TransactionFailureException: No mapping found for branchId
I'm not registering or configuring anything special as far as the NeoEmbedded store goes. We have been purposely crashing our app in the middle of transactions and killing the JVM processes to test data integrity under unexpected crashes. Despite this being a bug we are very happy with the outcome, because as of today we have always been able to restart, and the worst case we have been in so far was solved by removing the transaction logs.

2010/2/5 Johan Svensson jo...@neotechnology.com: Hi, This could be a bug. Do you use the XA-framework (org.neo4j.kernel.impl.transaction.xaframework) to register any additional data sources? If not, this is a bug, since the global transaction log finds an entry with branch id 262374 and no data source has been registered for that branch id. It would be great if you could write a test case that triggers this bug, or send me the active_tx_log together with the tm_tx_log.1 and tm_tx_log.2 logs next time it happens. Regards, -Johan

On Thu, Feb 4, 2010 at 7:33 PM, Raul Raja Martinez raulr...@gmail.com wrote: I'm not sure if this is a known bug or whether it has already been fixed. I'm seeing this in 1.0-b11-SNAPSHOT. Sometime after killing an app running Neo I'm unable to restart the app, as it always fails creating the EmbeddedNeo object.
Caused by: org.neo4j.impl.transaction.TransactionFailureException: No mapping found for branchId[0x262374]
    at org.neo4j.impl.transaction.XaDataSourceManager.getXaResource(XaDataSourceManager.java:182)
    at org.neo4j.impl.transaction.TxManager.getXaResource(TxManager.java:867)
    at org.neo4j.impl.transaction.TxManager.buildRecoveryInfo(TxManager.java:385)
    at org.neo4j.impl.transaction.TxManager.recover(TxManager.java:231)
    at org.neo4j.impl.transaction.TxManager.init(TxManager.java:159)
    at org.neo4j.impl.transaction.TxModule.start(TxModule.java:79)
    at org.neo4j.api.core.NeoJvmInstance.start(NeoJvmInstance.java:154)
    at org.neo4j.api.core.NeoJvmInstance.start(NeoJvmInstance.java:64)
    at org.neo4j.api.core.EmbeddedNeoImpl.init(EmbeddedNeoImpl.java:68)
    at org.neo4j.api.core.EmbeddedNeo.init(EmbeddedNeo.java:49)
A workaround is to manually remove the transaction logs: active_tx_log, tm_tx_log.1, tm_tx_log.2
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Strange behavior when calling getSingleRelationship : More than one relationship[DynamicRelationshipType[profile], OUTGOING]
Hi Tobias, Sounds good, thanks for the explanation. We're running 1.0-b11-SNAPSHOT. We will update on your next release. Thanks Raul

2010/2/3 Tobias Ivarsson tobias.ivars...@neotechnology.com: Hi Raul, Thanks for spotting this bug, not many people have. This is a cache bug that was found and fixed a while back (10 days or so). There has not been a new release since then, so if you are not running the version from trunk, that would explain it. If however you are running the version in trunk, that would mean that there are other causes of this problem that we have not found yet. So I guess my follow-up question to you is: which version of Neo4j are you running? The video you sent is still interesting (you don't have to send it to me); I'll watch it with Peter when he gets into the office this morning. To answer another question you had: yes, it does make sense to be able to have multiple relationships of the same type with the same source node and target node. In the case of this bug, however, you end up with two copies (in the in-memory cache, not on disk) of the exact same relationship. Cheers, Tobias

On Thu, Feb 4, 2010 at 12:18 AM, Raul Raja Martinez raulr...@gmail.com wrote: Attached is a video that shows a similar or the same problem. Calling getSingleRelationship results in an exception, but after inspecting the variables, all the relationships of that type have the same relationship id. Does it make sense to have two relationships of the same type attached to the same node with the same id, or is this a bug?

2010/2/3 Raul Raja Martinez raulr...@gmail.com: ...ThreadPoolTaskExecutor#7ec48b77-42 02/03 exception: class org.neo4j.api.core.NotFoundException : More than one relationship[DynamicRelationshipType[profile], OUTGOING] found for NodeImpl#56493... Strange behavior when calling getSingleRelationship. This is a strange bug or some issue we're finding when running in separate threads.
In some cases this exception is thrown even though the node has a single relationship when inspected at a debug point. In other cases, after inspecting the node, there are in fact two relationships of the same type pointing to the same end node but with different node proxy instances. Has anybody seen anything similar? We assumed Nodes and their operations are thread safe, so we have no particular synchronization around creating relationships. We have not been able to create a test case for this issue, other than putting a debug point on that exception and letting it run over thousands of records for a while with operations managed by a thread pool.
-- Tobias Ivarsson tobias.ivars...@neotechnology.com Hacker, Neo Technology www.neotechnology.com Cellphone: +46 706 534857
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo] org.neo4j.impl.transaction.TransactionFailureException: No mapping found for branchId
I'm not sure if this is a known bug or whether it has already been fixed. I'm seeing this in 1.0-b11-SNAPSHOT. Sometime after killing an app running Neo I'm unable to restart the app, as it always fails creating the EmbeddedNeo object.
Caused by: org.neo4j.impl.transaction.TransactionFailureException: No mapping found for branchId[0x262374]
    at org.neo4j.impl.transaction.XaDataSourceManager.getXaResource(XaDataSourceManager.java:182)
    at org.neo4j.impl.transaction.TxManager.getXaResource(TxManager.java:867)
    at org.neo4j.impl.transaction.TxManager.buildRecoveryInfo(TxManager.java:385)
    at org.neo4j.impl.transaction.TxManager.recover(TxManager.java:231)
    at org.neo4j.impl.transaction.TxManager.init(TxManager.java:159)
    at org.neo4j.impl.transaction.TxModule.start(TxModule.java:79)
    at org.neo4j.api.core.NeoJvmInstance.start(NeoJvmInstance.java:154)
    at org.neo4j.api.core.NeoJvmInstance.start(NeoJvmInstance.java:64)
    at org.neo4j.api.core.EmbeddedNeoImpl.init(EmbeddedNeoImpl.java:68)
    at org.neo4j.api.core.EmbeddedNeo.init(EmbeddedNeo.java:49)
A workaround is to manually remove the transaction logs: active_tx_log, tm_tx_log.1, tm_tx_log.2
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Strange behavior when calling getSingleRelationship : More than one relationship[DynamicRelationshipType[profile], OUTGOING]
Attached is a video that shows a similar or the same problem. Calling getSingleRelationship results in an exception, but after inspecting the variables, all the relationships of that type have the same relationship id. Does it make sense to have two relationships of the same type attached to the same node with the same id, or is this a bug?

2010/2/3 Raul Raja Martinez raulr...@gmail.com: ...ThreadPoolTaskExecutor#7ec48b77-42 02/03 exception: class org.neo4j.api.core.NotFoundException : More than one relationship[DynamicRelationshipType[profile], OUTGOING] found for NodeImpl#56493... Strange behavior when calling getSingleRelationship. This is a strange bug or some issue we're finding when running in separate threads. In some cases this exception is thrown even though the node has a single relationship when inspected at a debug point. In other cases, after inspecting the node, there are in fact two relationships of the same type pointing to the same end node but with different node proxy instances. Has anybody seen anything similar? We assumed Nodes and their operations are thread safe, so we have no particular synchronization around creating relationships. We have not been able to create a test case for this issue, other than putting a debug point on that exception and letting it run over thousands of records for a while with operations managed by a thread pool.
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Checking whether a relationship exists between two nodes...
Same issue here. It would be nice to have something like node.hasRelationshipTo(node, relationship... relationships);

2010/2/2 Maria Giatsoglou mgiat...@csd.auth.gr: Hello all, I am creating a project that performs a number of benchmark tasks on Neo. One of the tests measures the time required to create a relationship between two Neo nodes A and B. However, before creating the relationship, it should first be checked whether a relationship of the same type already exists between these two nodes. My current implementation calls the getRelationships() function on node A and then iterates over the returned Iterable, checking whether any relationship's end node is equal to node B. If no such relationship exists, the required relationship is created between nodes A and B. However, this technique seems to be very slow, with the creation of a relationship (including the check) taking around 57 ms to complete. Is there a faster way to implement this operation? I considered modifying the LuceneIndexService implementation to enable indexing relationships as well as nodes. Do you recommend such an approach for this problem? Many thanks in advance, Best regards, Maria Giatsoglou
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
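One way to avoid iterating over all of A's relationships on every check is to keep a small side index keyed on (start node, type, end node), which is close in spirit to the relationship indexing Maria mentions. The sketch below is only an illustration under that assumption: RelationshipIndex and its methods are hypothetical names, it is not part of Neo4j or LuceneIndexService, and it is neither transactional nor persistent, so the caller would have to keep it in sync with the graph.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical side index answering "does A -[type]-> B already exist?".
// NOT part of Neo4j; the caller must keep it in sync with the graph.
class RelationshipIndex {
    private final Set<String> keys = new HashSet<>();

    // Assumes type names never contain the '|' separator.
    private static String key(long startId, String type, long endId) {
        return startId + "|" + type + "|" + endId;
    }

    boolean hasRelationship(long startId, String type, long endId) {
        return keys.contains(key(startId, type, endId));
    }

    // Returns true if the caller should now create the relationship in the
    // graph, false if an identical one was already recorded.
    boolean recordIfAbsent(long startId, String type, long endId) {
        return keys.add(key(startId, type, endId));
    }
}
```

The check becomes a single hash lookup instead of a scan of the node's relationships, at the cost of extra memory and of having to update the index whenever relationships are created or deleted.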
Re: [Neo] Transaction Event Listeners
Thanks, we'll look into the transaction manager. On Jan 18, 2010 1:21 AM, Mattias Persson matt...@neotechnology.com wrote: At the moment you cannot register listeners for committed nodes or stuff like that, but you could take a look at the TransactionEventManager in the utils component, http://components.neo4j.org/neo4j-utils/ . You can use it to register listeners for events that you send yourself. Your listeners will then get those events as an array after a successful commit. That's not really what you asked for though. Another thing would be to register synchronization hooks on the JTA transactions, however that way you'll only be notified before and after a transaction is committed, not _what_ was committed. If any of those suggestions would be useful I'd be happy to give an example of such code. 2010/1/14 Raul Raja Martinez raulr...@gmail.com: Hi Tobias, At this time we need... -... Mattias Persson, [matt...@neotechnology.com] Neo Technology, www.neotechnology.com ___ Neo mailing list User@lists.neo4j.org https://lists ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo] Nodes transactions, modifications and concurrency
What would be the outcome of two transactions committing changes to the same property in different threads? Given this scenario:
Thread#1
Starts transaction at 12:00
Changes property title on Node#1 to "first"
Commits transaction at 12:04
Thread#2
Starts transaction at 12:01
Changes property title on Node#1 to "second"
Commits transaction at 12:03
Would the property value at 12:05 for that node be "first" or "second", or would an exception be thrown because Thread#1 acquired a write lock on that node and had not released it by the time Thread#2 starts or commits? In either case, are the transaction lock policies something that can be configured on a per-application basis? Thanks Raul
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Transaction Event Listeners
Hi Tobias, thanks for the info! I'm gonna consult with my team and we'll get back with some suggestions.

2010/1/14 Tobias Ivarsson tobias.ivars...@neotechnology.com: Hi Raul! Yes, Neo4j has such functionality, but we have not specified the final API for it yet. There is code implementing it from an old API that we used in house back when Neo4j was young. I don't know if all of its functionality is still hooked up though, and it is of course undocumented. If you look in the org.neo4j.kernel.impl.event package of the Neo4j kernel component, there is the current code for it. It is highly volatile and will likely only work with the current (1.0-rc) and next (1.0) release, since a public event framework is planned for the release after that (1.1). Since the proper event API isn't done yet, it is still possible to come with suggestions for how you would want it to work. Do you want events to be fired just before commit or just after? Or perhaps hooks for both? What data would you want access to in the event callback? Do you want coarse-grained events (commit, rollback) or fine-grained events (node_created, node_property_set, relationship_created, ...)? Ideas are welcome! Cheers, Tobias

On Thu, Jan 14, 2010 at 1:32 AM, Raul Raja Martinez raulr...@gmail.com wrote: Hi, Does neo4j provide some way of adding listeners to the transaction lifecycle? We would like to intercept when transactions are created, committed and rolled back to provide our own functionality, for example to ensure that a given node property is always set or that some other properties are within a range. Most ORM solutions like Hibernate allow devs to register for events in the system, so they can react when certain events are triggered. We need such functionality in general for Traversers, Transactions, Inserts, Updates, Cache etc...
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Tobias Ivarsson tobias.ivars...@neotechnology.com Hacker, Neo Technology www.neotechnology.com Cellphone: +46 706 534857 ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Raul Raja ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Transaction Event Listeners
Hi Tobias, At this time we need...
- Before Transaction Commit
  - Dirty nodes
  - Properties
  - Relationships
- After Transaction Commit
  - Committed Nodes
  - Properties
  - Relationships
I can see how other events would be useful, such as property set, removed, changed... I think it'd be great if none of these were actually evaluated, stored or processed unless there are listeners expecting to be notified. Also, we need to understand the scope of getting notified of such events in the context of HA, where the event may be triggered on one server in the cluster or on all of them at once if the transactions are distributed. Thanks!

2010/1/14 Raul Raja Martinez raulr...@gmail.com: Hi Tobias, thanks for the info! I'm gonna consult with my team and we'll get back with some suggestions.

2010/1/14 Tobias Ivarsson tobias.ivars...@neotechnology.com: Hi Raul! Yes, Neo4j has such functionality, but we have not specified the final API for it yet. There is code implementing it from an old API that we used in house back when Neo4j was young. I don't know if all of its functionality is still hooked up though, and it is of course undocumented. If you look in the org.neo4j.kernel.impl.event package of the Neo4j kernel component, there is the current code for it. It is highly volatile and will likely only work with the current (1.0-rc) and next (1.0) release, since a public event framework is planned for the release after that (1.1). Since the proper event API isn't done yet, it is still possible to come with suggestions for how you would want it to work. Do you want events to be fired just before commit or just after? Or perhaps hooks for both? What data would you want access to in the event callback? Do you want coarse-grained events (commit, rollback) or fine-grained events (node_created, node_property_set, relationship_created, ...)? Ideas are welcome!
Cheers, Tobias

On Thu, Jan 14, 2010 at 1:32 AM, Raul Raja Martinez raulr...@gmail.com wrote: Hi, Does neo4j provide some way of adding listeners to the transaction lifecycle? We would like to intercept when transactions are created, committed and rolled back to provide our own functionality, for example to ensure that a given node property is always set or that some other properties are within a range. Most ORM solutions like Hibernate allow devs to register for events in the system, so they can react when certain events are triggered. We need such functionality in general for Traversers, Transactions, Inserts, Updates, Cache etc...
-- Tobias Ivarsson tobias.ivars...@neotechnology.com Hacker, Neo Technology www.neotechnology.com Cellphone: +46 706 534857
-- Raul Raja
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
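To make the request in this thread concrete, here is a rough sketch of what before-commit and after-commit hooks could look like. It is purely hypothetical: TxListener, ListenableTx and their methods are invented for illustration and are not the org.neo4j.kernel.impl.event code or any planned Neo4j API. A before-commit listener can veto the commit by throwing, and no snapshot is built when nobody is listening, as asked for above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical listener contract; NOT a Neo4j interface.
interface TxListener {
    void beforeCommit(Map<String, Object> dirtyProperties);     // may throw to veto
    void afterCommit(Map<String, Object> committedProperties);
}

// Hypothetical transaction over a plain property map, for illustration only.
class ListenableTx {
    private final Map<String, Object> store;                     // committed state
    private final Map<String, Object> pending = new LinkedHashMap<>();
    private final List<TxListener> listeners;

    ListenableTx(Map<String, Object> store, List<TxListener> listeners) {
        this.store = store;
        this.listeners = listeners;
    }

    void setProperty(String key, Object value) { pending.put(key, value); }

    void commit() {
        if (listeners.isEmpty()) {
            // Nobody is listening: skip building any event data.
            store.putAll(pending);
            pending.clear();
            return;
        }
        Map<String, Object> snapshot = new LinkedHashMap<>(pending);
        for (TxListener l : listeners) l.beforeCommit(snapshot); // veto point
        store.putAll(pending);
        pending.clear();
        for (TxListener l : listeners) l.afterCommit(snapshot);
    }
}
```

A validation rule such as "a title property must always be set" would live in beforeCommit, while cache invalidation or notifications would go in afterCommit. How such events should behave across an HA cluster is left open here, as it is in the thread.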
Re: [Neo] Memory management in neo4j?
I think the problem is that everything is in one big transaction. Try splitting it into smaller transactions.

On Jan 14, 2010 6:33 PM, Defenestrator defenestration...@gmail.com wrote: How is memory managed in neo4j; does it page data out to disk like all database systems? Here's a sample program that consumes about 2.1 GB of memory on my system (8 GB of total memory) before failing with an OutOfMemoryError.
Exception in thread main java.lang.OutOfMemoryError: Java heap space
    at java.util.AbstractList.iterator(AbstractList.java:273)
    at org.neo4j.impl.transaction.TransactionImpl.doBeforeCompletion(TransactionImpl.java:336)
    at org.neo4j.impl.transaction.TxManager.rollback(TxManager.java:725)
    at org.neo4j.impl.transaction.TransactionImpl.rollback(TransactionImpl.java:114)
    at org.neo4j.api.core.EmbeddedNeoImpl$TransactionImpl.finish(EmbeddedNeoImpl.java:390)
    at test.GraphTester.main(GraphTester.java:61)
Below is the program, which tries to create 100K user nodes and 5.5M relationships between those user nodes, based on a Zipfian distribution of the number of relationships per node, limited to 1000 per user node. It seems like this should work, because this is well under billions of nodes/relationships.
package test;

import java.io.BufferedReader;
import java.io.FileReader;

import org.neo4j.api.core.EmbeddedNeo;
import org.neo4j.api.core.NeoService;
import org.neo4j.api.core.Node;
import org.neo4j.api.core.Relationship;
import org.neo4j.api.core.RelationshipType;
import org.neo4j.api.core.Transaction;

public final class GraphTester {
    public static void main(String[] args) throws Exception {
        int num_users = 10;
        NeoService neo = new EmbeddedNeo("var/neo");
        Transaction tx = neo.beginTx();
        Node[] nodes = new Node[num_users];
        try {
            // create the user nodes
            for (int i = 0, n = nodes.length; i < n; i++) {
                nodes[i] = neo.createNode();
                nodes[i].setProperty("name", String.format("user%d", i));
                if (i == 9) System.out.println(nodes[i].getId());
            }
            FileReader fr = new FileReader("/home/defenestrator/dev/python/zipf/100kusers.csv");
            BufferedReader br = new BufferedReader(fr);
            int node_idx = 0;
            String s = null;
            // each line of the file holds the number of neighbors for one node
            while ((s = br.readLine()) != null) {
                int num_neighbors = Integer.parseInt(s);
                for (int i = 1; i <= num_neighbors; i++) {
                    int next_neighbor_idx = node_idx + (i * 7);
                    // handle wrap-around
                    if (next_neighbor_idx > (num_users - 1))
                        next_neighbor_idx = next_neighbor_idx % num_users;
                    nodes[node_idx].createRelationshipTo(nodes[next_neighbor_idx], MyRelationshipTypes.KNOWS);
                }
                node_idx++;
            }
            tx.success();
        } finally {
            tx.finish();
        }
        Node node1 = nodes[9];
        System.out.println(node1.getProperty("name"));
        System.out.println(node1.getId());
        Iterable<Relationship> knows = node1.getRelationships();
        for (Relationship know : knows) {
            Node friend_node = know.getOtherNode(node1);
            System.out.println("\t" + friend_node.getProperty("name"));
        }
        neo.shutdown();
    }

    private static enum MyRelationshipTypes implements RelationshipType {
        KNOWS
    }
}
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
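The suggestion at the top of this message, splitting one big transaction into smaller ones, could be sketched as a generic batching loop. The UnitOfWork and TxFactory interfaces below are hypothetical stand-ins that only mirror the success()/finish() pattern used in the program above; they are not Neo4j types, and a suitable batch size would have to be found by experiment.

```java
import java.util.function.IntConsumer;

// Hypothetical minimal transaction handle mirroring the success()/finish()
// pattern in the program above; NOT a Neo4j type.
interface UnitOfWork {
    void success();
    void finish();
}

// Hypothetical factory standing in for something like neo.beginTx().
interface TxFactory {
    UnitOfWork begin();
}

class BatchedWriter {
    // Runs op(0..totalOps-1), committing after every batchSize operations.
    // Returns the number of transactions that were completed.
    static int runInBatches(int totalOps, int batchSize, TxFactory factory, IntConsumer op) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        int transactions = 0;
        int i = 0;
        while (i < totalOps) {
            UnitOfWork tx = factory.begin();
            try {
                int end = Math.min(i + batchSize, totalOps);
                for (; i < end; i++) op.accept(i);
                tx.success();            // mark this batch as successful
            } finally {
                tx.finish();             // commit (or roll back on failure)
            }
            transactions++;
        }
        return transactions;
    }
}
```

The trade-off is that the import is no longer atomic as a whole: if a later batch fails, earlier batches stay committed, so the loader would need to be restartable or idempotent.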
[Neo] Transaction Event Listeners
Hi, Does neo4j provide some way of adding listeners to the transaction lifecycle? We would like to intercept when transactions are created, committed and rolled back to provide our own functionality, for example to ensure that a given node property is always set or that some other properties are within a range. Most ORM solutions like Hibernate allow devs to register for events in the system, so they can react when certain events are triggered. We need such functionality in general for Traversers, Transactions, Inserts, Updates, Cache etc...
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Abstract unit test cases
Thanks, that is helpful! 2010/1/12 Mattias Persson matt...@neotechnology.com: There's some classes I usually use (fits me, since I wrote them :) ). They are in https://svn.neo4j.org/laboratory/users/mattias/neo-test-fw/ 2010/1/12 Raul Raja Martinez raulr...@gmail.com: Anybody care to share a strategy or base classes for unit testing operations in the graph? ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Neo Technology, www.neotechnology.com ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Raul Raja ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Read Transactions? Really?
I understand, but imagine this case...

Model - Person encapsulates the underlying node and accesses its name property
Controller - Handler that receives the http request, requests a model, has the transaction, and delegates the model to a template to be rendered
View - Template that gets rendered with domain objects

If a property accessed in the view from a domain object delegates the access to a node wrapped in the model, that access would happen outside a transaction and would fail if it required one. If you have a view object that gets fully initialized from a domain object before being passed to the view, then it works, but it requires an extra layer of view objects and forces initialization of properties that might never be used in the rendering phase.

2010/1/12 Ryan Levering rrlever...@gmail.com: Then this wouldn't be M-V-C, it would be M/C-V. The point of having that separate layer is that you don't need to worry about the model abstraction. Generally, controller level transactions should only exist because you actually want to force multi-operation atomicity, not because you know the underlying database needs to be transactionalized. And again, if your controller is doing a lot of reading, it will have to be split up into multiple transaction blocks. Ryan

On Jan 12, 2010, at 5:28 AM, Mattias Persson wrote: Your code doesn't need to be littered with transaction management. You can have a very big MVC (Model-View-Controller) application or something like that and _only_ have transaction handling in one place... in the Controller. See http://wiki.neo4j.org/content/Transactions#Best_practices (page under construction)

2010/1/12 Laurent Laborde kerdez...@gmail.com: On Tue, Jan 12, 2010 at 6:41 AM, Ryan Levering rrlever...@gmail.com wrote: 1. User.getName() I want a property off the node. I assume I encapsulate the single internal getProperty call in a transaction? My accessor code has just quadrupled in size, even if I am ok with this silly, time-consuming transaction.
Leaving it out means every piece of code in my system has to understand to start a neo4j transaction just to get the user's name. 2: for (User user : getUsers()) I want to do something to all the users in the system. There are A LOT of them and they are all linked to a UserFactory or whatever. I can't create an internal list, because they won't all fit in memory. So I wrap the IterableRelationship/Node in an Iterable that wraps the nodes in POJOs. Only there is no way to do this without enclosing that loop in a transaction. Ha ! Thank you for explaining my problem and POV in a much more readable english than me :) Do people just create giant transactions that wrap their entire programs since they don't work with large graphs/models? Or is your code entirely littered with transaction = service.beginTx(); transaction.success(); transaction.finish()? When i can, i have short transaction everywhere. When i have a very big traverser to do ... well... i have a top-level transaction. Another workaround i use is to fill an array with the traverser, and i work on the array. That why i switched to 8GB on my desktop PC -_-' I've looked around on the wiki and site and I can't find good answers to these questions. The IMDB example transactionalizes things at a very high level. In a sense, wrapping the entire program (web request) in a single transaction. I guess that's fine if you have small operations and use Spring injection, but what if you don't? 
You do it the Oracle(tm) way : buy more ram :) -- Laurent ker2x Laborde Sysadmin DBA at http://www.over-blog.com/ ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Neo Technology, www.neotechnology.com ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Raul Raja ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] integer and substring searching using LuceneFulltextQueryIndexService
AFAIK this behavior is specific to Lucene, which indexes everything as strings and compares them in their natural (lexicographic) order. Compared as strings, "12" sorts between "1" and "2", which is why myNode3 matches the range and myNode4 ("21") does not. http://wiki.apache.org/lucene-java/SearchNumericalFields

2010/1/11 Zerony Zhao bw.li...@gmail.com: Hi Neo4j users, I am confused with LuceneFulltextQueryIndexService.

For integer,

IndexService index = new LuceneFulltextQueryIndexService( neo );
index.index( myNode1, someKey, 1 );
index.index( myNode2, someKey, 2 );
index.index( myNode3, someKey, 12 );
index.index( myNode4, someKey, 21 );
// This will return myNode1, myNode2 and myNode3, not myNode4, why?
index.getNodes( someKey, "[1 TO 2]" );

If I really want 1 <= x <= 2, I need to use the padding pattern. Are there any other ways to do this?

For string,

myNode1 somekey -- a
myNode2 somekey -- b
myNode3 somekey -- aa

IndexService index = new LuceneFulltextQueryIndexService( neo );
index.index( myNode1, someKey, "a" );
index.index( myNode2, someKey, "ba" );
index.index( myNode3, someKey, "ad" );
// This will return myNode1
index.getNodes( someKey, "a" );
// This will return myNode1, myNode3
index.getNodes( someKey, "a*" );
// Wrong (Lucene does not support *a*)
index.getNodes( someKey, "*a*" );

How can I search for the substring *a*? And what is the difference in how Lucene indexes integers and strings, if it can be explained here in a few short sentences? Appreciate your help, Zerony
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Raul Raja
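The padding pattern mentioned in the question can be sketched in plain Java, without touching the index API at all. This is only an illustration under stated assumptions: the class name PaddedKeys, the fixed width of 10 digits, and the inRange helper are made up for this example, and the range check merely imitates what a string-ordered index does when it compares terms.

```java
public class PaddedKeys {
    // Assumption: values are non-negative ints, so 10 digits always suffice
    // (Integer.MAX_VALUE is 2147483647).
    static final int WIDTH = 10;

    /** Left-pads a non-negative int with zeros so that string order equals numeric order. */
    public static String pad(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        return String.format("%0" + WIDTH + "d", value);
    }

    /** Inclusive range check done purely on the string form, as a string-ordered index would do. */
    public static boolean inRange(String key, String low, String high) {
        return key.compareTo(low) >= 0 && key.compareTo(high) <= 0;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 12, 21};
        for (int v : values) {
            boolean raw = inRange(String.valueOf(v), "1", "2");
            boolean padded = inRange(pad(v), pad(1), pad(2));
            System.out.println(v + " raw=" + raw + " padded=" + padded);
        }
    }
}
```

With unpadded strings, 12 falls inside the range [1 TO 2]; with padded strings only 1 and 2 do. The same padded form would have to be used both when indexing and when building the query.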
Re: [Neo] Read Transactions? Really?
Not that this solves your issue, but:

1. If it is a webapp, you can use a filter that wraps each request in a transaction (try/catch/finally); your code would then be (mostly) free of transaction handling.
2. If you use a Spring service, you can use @Transactional or AOP interceptors to mark operations as transactional.
3. If you use plain old Java code... well, isn't it the nature of transactions, and the fact that your code may end up throwing an exception, that forces you to try/catch if you want your transactions to roll back? Java's exception handling, with its imposed try/catch blocks, is to blame for this, not the fact that you have to handle transactions programmatically.

I can think of some nasty solutions where you could weave your classes to intercept neo calls and wrap them in transactions without having to type the transactions explicitly, but that would still result in one transaction per operation. In my opinion, even client-side or console-based apps can benefit from an IoC container like Spring that lets you handle transactions with annotations or interceptors.

Having said that, I agree that it is not common to have to explicitly declare a transaction for reads. Maybe neo could internally create those transactions when needed on reads. I also understand that in a heavily optimized system, locks and writes may be required when reading if the structures need to be rebalanced based on usage to provide a more efficient traversal the next time around. I agree with you; I ran into the same issue when we started working with neo and found I had to declare transactions around traversers.

2010/1/11 Ryan Levering rrlever...@gmail.com: I know this has come up a couple of times and that transactions are being worked on in Traversers, but I just can't get my head around how people use neo4j transactions in general. When I started writing neo4j code a couple days ago, I was amazed at two things: 1) how efficient and fast it was and 2) how easy it was to write.
It remained that way until I started writing more interesting, deep, and large models. Now, after several days of working with it, I am fed up with transaction management in Neo. I feel that the implementation and strict reliance on transactions ruins those two things that I loved about it. I understand transaction management in general and have implemented several systems that relied on SQL transaction management, but when the simplest read requires a transaction, it makes the code very cumbersome. In addition, my profiling shows that a lot of program time is being spent on transaction management in very simple reading code. I'm hoping that the reason is that I just haven't figured out the correct usage pattern for working on nodes and, very importantly, encapsulating the transaction management in my model. I'm really looking for a solution here, not just criticizing neo4j. Let's say I have a User object, for purposes of example. Here are two basic things I want to do: 1. User.getName() I want a property off the node. I assume I encapsulate the single internal getProperty call in a transaction? My accessor code has just quadrupled in size, even if I am ok with this silly, time-consuming transaction. Leaving it out means every piece of code in my system has to understand to start a neo4j transaction just to get the user's name. 2: for (User user : getUsers()) I want to do something to all the users in the system. There are A LOT of them and they are all linked to a UserFactory or whatever. I can't create an internal list, because they won't all fit in memory. So I wrap the IterableRelationship/Node in an Iterable that wraps the nodes in POJOs. Only there is no way to do this without enclosing that loop in a transaction. Do people just create giant transactions that wrap their entire programs since they don't work with large graphs/models? Or is your code entirely littered with transaction = service.beginTx(); transaction.success(); transaction.finish()? 
I've looked around on the wiki and site and I can't find good answers to these questions. The IMDB example transactionalizes things at a very high level. In a sense, wrapping the entire program (web request) in a single transaction. I guess that's fine if you have small operations and use Spring injection, but what if you don't? ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Raul Raja ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
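One way to keep the beginTx / success / finish sequence out of accessor code, short of a servlet filter or Spring, is a small template helper. The sketch below is an assumption-laden illustration, not the Neo4j API: Tx and TxSource are hypothetical stand-ins that only mirror the three calls named in this thread, and RecordingSource is a fake used to show the call order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class TxTemplate {
    /** Hypothetical stand-in for the transaction handle discussed in the thread. */
    public interface Tx {
        void success();
        void finish();
    }

    /** Hypothetical stand-in for the service that hands out transactions. */
    public interface TxSource {
        Tx beginTx();
    }

    private final TxSource source;

    public TxTemplate(TxSource source) {
        this.source = source;
    }

    /** Runs the given work inside one transaction and returns its result. */
    public <T> T execute(Supplier<T> work) {
        Tx tx = source.beginTx();
        try {
            T result = work.get();
            tx.success(); // reached only if the work completed without throwing
            return result;
        } finally {
            tx.finish(); // always runs, whether or not success() was called
        }
    }

    /** Fake source that records the calls it receives; for demonstration only. */
    public static class RecordingSource implements TxSource {
        public final List<String> log = new ArrayList<String>();

        public Tx beginTx() {
            log.add("begin");
            return new Tx() {
                public void success() { log.add("success"); }
                public void finish() { log.add("finish"); }
            };
        }
    }

    public static void main(String[] args) {
        RecordingSource source = new RecordingSource();
        TxTemplate template = new TxTemplate(source);
        String name = template.execute(() -> "user9");
        System.out.println(name + " " + source.log);
    }
}
```

An accessor such as User.getName() would then shrink to a single template.execute(...) call. It still costs one transaction per read, which is the overhead Ryan is objecting to, so this tidies the code without addressing the performance concern.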
[Neo] Abstract unit test cases
Anybody care to share a strategy or base classes for unit testing operations in the graph? ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Retrieving nodes ordered by property
Hi Tobias, Thanks for the info! I understand the implications of returning ordered nodes. Do you guys plan to build such support natively to neo4j even if it is for a basic set of node properties? While retrieving results ordered by property values or more extensive queries capabilities is provably not easy when dealing with a graph, it is currently one of the main things a programmer encounters when comparing neo4j to relational databases for storage and info retrieval. Would it make sense to have something like node.setOrderedProperty(property, value); so that the linked list based on the natural order of the value is kept internally by neo4j and then have another param in the traversers to specify that the traversal should follow the relationships established by the ordered linked nodes? If you guys think ordering based on property is out of the scope that's fine, I'm just asking in case we can avoid having to roll our own hack or component to get it working since ordered entities is a requirement in all of our current projects where we plan to use neo4j Thanks Raul 2010/1/4 Tobias Ivarsson tobias.ivars...@neotechnology.com: Getting ordered results from any system always requires sorting, unless the ordering property is stored. And sorting always requires (at least) O(n) memory and O( log(n!) ) time for comparison sort, possibly O( n ) time for sorting integer keys. So if you want results to be sorted on an arbitrary property you will have to sort the entire result set and keep it around during your pagination process (possibly redoing the query + entire sorting + skipping a few pages, to preserve memory). If you know which properties you will want to order your results by when you are designing your database you can store the ordering information in the database. I would suggest a linked list of relationships in between the nodes, in the natural order for the sorting property. You need to be aware of two things with this approach though. 
The first one is that if the sorting property is unrelated to the filtering property, if you want something like give me all nodes where x=15 ordered by y, you will either have to store separate linked lists for each value of x, filter the result set while traversing through the results as they are ordered by y, or revert to the sorting approach. The second thing to be aware of is that insertion (and changing the value of the ordering property in some node) will require some overhead, to preserve the order. If your queries are as simple as give me all nodes where xLOWER_LIMIT and xUPPER_LIMIT ordered by x, and you can reduce the x property to a long integer value, then the timeline index will do this for you. Otherwise there is no ready made component for this today. Happy hacking, Tobias On Mon, Jan 4, 2010 at 1:30 AM, Raul Raja Martinez raulr...@gmail.comwrote: Hi, Anybody has any experience returning indexed nodes ordered by a given property?. For example return all nodes ordered by creationDate. I understand that if the node property is not indexed I'd have to iterate over all nodes first then order then limit the results which seems overkill to me. I'd like to be able to do... get me all nodes from start to limit ordered by property. This is necessary when the data is iterated over using pagination and the order determines what the next start node is next. ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Tobias Ivarsson tobias.ivars...@neotechnology.com Hacker, Neo Technology www.neotechnology.com Cellphone: +46 706 534857 ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Raul Raja ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
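Tobias's point about storing the ordering, rather than sorting on every query, can be illustrated outside the graph with a sorted map. This is a sketch under assumptions, not a Neo4j component: OrderedPaging is a made-up class, a TreeMap stands in for the stored ordering (the linked list or timeline index), and ordering keys are assumed to be unique long values such as a creation timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class OrderedPaging {
    // Stand-in for a stored ordering: keys are kept sorted on insertion,
    // so reads never need to sort the whole result set.
    private final NavigableMap<Long, String> byKey = new TreeMap<Long, String>();

    /** Inserting pays the ordering cost up front (O(log n) per entry here). */
    public void put(long orderingKey, String nodeName) {
        byKey.put(orderingKey, nodeName);
    }

    /**
     * Returns up to limit names whose key is strictly greater than afterKey.
     * Pass null to start from the beginning.
     */
    public List<String> page(Long afterKey, int limit) {
        NavigableMap<Long, String> tail =
                (afterKey == null) ? byKey : byKey.tailMap(afterKey, false);
        List<String> out = new ArrayList<String>();
        for (Map.Entry<Long, String> e : tail.entrySet()) {
            if (out.size() == limit) {
                break;
            }
            out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        OrderedPaging paging = new OrderedPaging();
        paging.put(30L, "c");
        paging.put(10L, "a");
        paging.put(20L, "b");
        paging.put(40L, "d");
        System.out.println(paging.page(null, 2)); // first page
        System.out.println(paging.page(20L, 2));  // page after key 20
    }
}
```

Each page resumes from the last key seen instead of skipping an offset, which matches the "order determines what the next start node is" requirement in the original question. Filtering on an unrelated property would still require one of the three options Tobias lists.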
Re: [Neo] Retrieving nodes ordered by property
Thanks for all your answers, Everything said makes sense. I will get back to you guys with what we end up doing in case it is of any value for anyone else. 2010/1/4 Craig Taverner cr...@amanzi.com: I think a generic solution would mean a generic numerical index. I wrote an index recently that is very similar in principle to the timeline index Tobias mentioned below, except I allow multiple dimensions and indexing over any numerical primitive type. Then we get to the next point, combining indexes. Like in Tobias example: give me all nodes where x=15 ordered by y. This requires an index that works on x and y at the same time. Otherwise you have to index one, and do a brute force search on the other in the result set. I have an idea about dynamically generating a combined index based on queries received by the database. So if the user does a query like the one above, involving both x and y, and both of these are indexed, then a combined index can be added dynamically. The first query will not be fast, but subsequent ones would be. And as far as I can see, a combined index should be as simple as linking nodes between the un-combined indexes. Now who's up for a n-way combination of k-dimensional indexes? I think I'm getting a headache just thinking about it :-) On Mon, Jan 4, 2010 at 10:17 AM, Raul Raja Martinez raulr...@gmail.comwrote: Hi Tobias, Thanks for the info! I understand the implications of returning ordered nodes. Do you guys plan to build such support natively to neo4j even if it is for a basic set of node properties? While retrieving results ordered by property values or more extensive queries capabilities is provably not easy when dealing with a graph, it is currently one of the main things a programmer encounters when comparing neo4j to relational databases for storage and info retrieval. 
Would it make sense to have something like node.setOrderedProperty(property, value); so that the linked list based on the natural order of the value is kept internally by neo4j and then have another param in the traversers to specify that the traversal should follow the relationships established by the ordered linked nodes? If you guys think ordering based on property is out of the scope that's fine, I'm just asking in case we can avoid having to roll our own hack or component to get it working since ordered entities is a requirement in all of our current projects where we plan to use neo4j Thanks Raul 2010/1/4 Tobias Ivarsson tobias.ivars...@neotechnology.com: Getting ordered results from any system always requires sorting, unless the ordering property is stored. And sorting always requires (at least) O(n) memory and O( log(n!) ) time for comparison sort, possibly O( n ) time for sorting integer keys. So if you want results to be sorted on an arbitrary property you will have to sort the entire result set and keep it around during your pagination process (possibly redoing the query + entire sorting + skipping a few pages, to preserve memory). If you know which properties you will want to order your results by when you are designing your database you can store the ordering information in the database. I would suggest a linked list of relationships in between the nodes, in the natural order for the sorting property. You need to be aware of two things with this approach though. The first one is that if the sorting property is unrelated to the filtering property, if you want something like give me all nodes where x=15 ordered by y, you will either have to store separate linked lists for each value of x, filter the result set while traversing through the results as they are ordered by y, or revert to the sorting approach. 
The second thing to be aware of is that insertion (and changing the value of the ordering property in some node) will require some overhead, to preserve the order. If your queries are as simple as give me all nodes where xLOWER_LIMIT and xUPPER_LIMIT ordered by x, and you can reduce the x property to a long integer value, then the timeline index will do this for you. Otherwise there is no ready made component for this today. Happy hacking, Tobias On Mon, Jan 4, 2010 at 1:30 AM, Raul Raja Martinez raulr...@gmail.com wrote: Hi, Anybody has any experience returning indexed nodes ordered by a given property?. For example return all nodes ordered by creationDate. I understand that if the node property is not indexed I'd have to iterate over all nodes first then order then limit the results which seems overkill to me. I'd like to be able to do... get me all nodes from start to limit ordered by property. This is necessary when the data is iterated over using pagination and the order determines what the next start node is next. ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org
Re: [Neo] Persisting store metadata
We personally use a facade/factory that allows us to interface with neo4j and the Lucene indexes as if they were a single system. This service is a singleton initialized by Spring, with init and destroy methods that properly start and shut down the store and the index. If it was up to me, the index utils would be part of the core neo and there would be a node.setIndexedProperty() or some kind of api that does exactly the same as setProperty but also indexes the property. We use Spring, but any kind of IoC container would do something similar. Our config looks like this, wrapping the services to properly manage both the neo and index services lifecycle...

<bean id="entityFactory" class="com.yourpackage.Neo4jEntityFactory" autowire="autodetect" init-method="init" destroy-method="destroy">
    <property name="neoService">
        <bean id="neoService" class="org.neo4j.api.core.EmbeddedNeo">
            <constructor-arg index="0" value="${graph.filestore}"/>
        </bean>
    </property>
    <property name="indexService">
        <bean class="org.neo4j.util.index.LuceneIndexService">
            <constructor-arg index="0" ref="neoService"/>
            <property name="isolation" value="SAME_TX"/>
            <property name="sorting" value="INDEXORDER"/>
        </bean>
    </property>
    <property name="fullTextIndexService">
        <bean class="com.yourpackage.LuceneFullTextSimpleQueryIndexService">
            <constructor-arg index="0" ref="neoService"/>
            <property name="isolation" value="SAME_TX"/>
            <property name="sorting" value="INDEXORDER"/>
        </bean>
    </property>
</bean>

Other IoC containers allow registration of services by simply dropping in the jar and contributing to the system's extension points, so that if you drop the index utils in the classpath you can contribute to lifecycles such as init, destroy, per request, around transaction, etc...

2010/1/4 Tobias Ivarsson tobias.ivars...@neotechnology.com: Hi, Neo4j today is made up of a core service that persists nodes, relationships and properties. In addition to that there are a number of additional services, where index-util is probably the most used and most important at the moment.
index-util is also a good example of the problem I would like to discuss in this email. With the current architecture there is no way of introspecting which additional services have been initialized with a particular Neo store, for example there is no way of telling which (if any) IndexService has been used. This causes problems with the transaction recovery process, where the recovery mechanism today has to know about all possible extension services without having a compile time dependency on them. Needless to say this causes a mess. The code for this is ugly and borderline-buggy. It also causes a problem when introspecting a store, or even restarting your software, since it is up to the programmer to remember to restart all of the same services as last time. I would like to propose an addition to the store, in a separate metadata file in the store directory, where we store a simple list of all started additional services. Does anyone have any suggestions to what this should look like to be reasonable future proof (i.e. be able to handle some service that is not implemented yet as well as the current index-utils). What comes to mind is something similar to the Java ServiceLoader API [1], but simplified with the fact that we can require all classes referenced in the file to implement one specific interface, and we know where the file will be, in effect taking out all the complications involving ClassLoaders. What I'm worrying about is that storing Java classnames in a file will make this metadata unusable to any future implementations of the NeoStore format outside of the Java platform. Any suggestions to how information about loaded services can be stored is welcome, my idea of a clear text file might not even be good (since people have a tendency to think they can patch those manually). Since we have a community of smart and entrepreneurial individuals I thought I'd throw the question out here and see if I got any good responses. 
Cheers, Tobias [1] http://java.sun.com/javase/6/docs/api/java/util/ServiceLoader.html Simple explanation of how it works: * You have an interface with fully qualified classname: com.somedomain.someproduct.ServiceInterface * In each jar that provides implementations of the ServiceInterface you add a file called META-INF/services/com.somedomain.someproduct.ServiceInterface * In this file you put the fully qualified classnames of each implementing class separated by newline * In your code you do: ServiceLoaderServiceInterface impls = ServiceLoader.load(ServiceInterface.class); -- Tobias Ivarsson tobias.ivars...@neotechnology.com Hacker, Neo Technology www.neotechnology.com
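The ServiceLoader steps listed in the footnote can be sketched as a small runnable program. The interface name StoreExtension and the class ServiceDiscovery are hypothetical names chosen for this example; only java.util.ServiceLoader itself is the real JDK API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceDiscovery {
    /** Hypothetical extension interface that every additional service would implement. */
    public interface StoreExtension {
        String name();
    }

    /**
     * Asks ServiceLoader for every implementation registered on the classpath.
     * Implementations are registered by listing their class names, one per line,
     * in a META-INF/services file named after the interface's binary name
     * (here: META-INF/services/ServiceDiscovery$StoreExtension).
     */
    public static List<String> discover() {
        List<String> names = new ArrayList<String>();
        for (StoreExtension ext : ServiceLoader.load(StoreExtension.class)) {
            names.add(ext.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // With no provider file on the classpath the list is simply empty.
        System.out.println("extensions found: " + discover());
    }
}
```

Note that this discovers what is on the classpath, not what was used with a particular store, which is why the proposal above adds a metadata file inside the store directory.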
Re: [Neo] Neo in a cluster?
I understand. When I started working on our annotation-based project I took the approach of loading all properties into the bean, but found that if you use the beans inside a ReturnEvaluator, all properties have to be marshalled on each iteration even when you only want to compare something like if (person.getName().equals(otherPerson.getName())). In an ORM approach on top of a relational db this makes sense, since the data resides in the db and you have to bring it into memory; with neo the data may already be in memory, and it gets duplicated in every instance of the bean on a per-thread basis. Marshalling this way makes the traversal slow, and we had to back out of that approach. Also, an API that simply queries for a single type, like "give me the people with name equal to something", is really like using a regular relational database, since it does not take advantage of the power of traversing the graph.

We use Java dynamic proxies to implement the annotated interfaces and load only the properties used in the traversal comparison, or anywhere else a property is called. That way the traversal is as fast as using low-level node access, and the returned objects' properties are only initialized when the getter is invoked, by delegating to the node. I'm telling you all of this because we already went down the eager-loading path and it didn't work for us, in case you can benefit from the experience.

2010/1/3 Avishay Orpaz avish...@yahoo.com I load all properties eagerly. In order to load properties lazily, one needs either to use proxies or bytecode manipulation. Each technique has its own strengths and weaknesses, and I'm using neither. Hopefully, one day I'll integrate my framework with neo-weave, a combination that will give this functionality.
Avishay From: Raul Raja Martinez raulr...@gmail.com To: Neo user discussions user@lists.neo4j.org Sent: Sun, January 3, 2010 2:09:36 AM Subject: Re: [Neo] Neo in a cluster? In your framework, do all properties of a pojo get loaded when you load the pojo by calling getProperty on the nodes? Looking fwd to learn more about it. On Jan 2, 2010 3:43 PM, Avishay Orpaz avish...@yahoo.com wrote: Wow... it's becoming crowded out there. I'm also designing an annotation-based persistence framework for neo4j. Hopefully, it will become public (open source, of course) in a few days. Generally it follows the same lines as jo4neo (a projects I learned about just recently). There are, however, some noticable differences: 1. Transactions are left out of the framework, and their management is left to the user 2. Relationships can also be mapped to POJOs 3. Embedded objects are not stored in serialized binary form. They can be annotated and stored as key-value pairs, just like the main object. There is more, but I'll like to hear about other features that can be interesting in such a framework. Avishay From: Raul Raja Martinez raulr...@gmail.com To: Neo user discussions user@lists.neo4j.org Sent: Sat, January 2, 2010 2:09:09 AM Subject: Re: [Neo] Neo in a cluster? Hi Peter, Yes we looked at jo4neo and found it very interesting and it probably suits most people u... ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Raul Raja ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
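The dynamic-proxy approach described in the reply above can be sketched with the JDK's java.lang.reflect.Proxy. This is a simplified illustration, not the poster's framework: Person and CountingStore are hypothetical, and the store is a plain map standing in for a node's properties so the example runs without a graph.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class LazyBeanProxy {
    /** Hypothetical domain interface; in the thread this would be an annotated interface. */
    public interface Person {
        String getName();
        Integer getAge();
    }

    /** Stand-in for a node: a property map that counts how often it is read. */
    public static class CountingStore {
        private final Map<String, Object> props = new HashMap<String, Object>();
        public int reads = 0;

        public void set(String key, Object value) {
            props.put(key, value);
        }

        public Object get(String key) {
            reads++;
            return props.get(key);
        }
    }

    /** Creates a proxy whose getters read from the store only when they are called. */
    @SuppressWarnings("unchecked")
    public static <T> T wrap(final CountingStore store, Class<T> type) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) {
                String m = method.getName();
                if (m.startsWith("get") && m.length() > 3 && (args == null || args.length == 0)) {
                    // getName -> "name"
                    String key = Character.toLowerCase(m.charAt(3)) + m.substring(4);
                    return store.get(key);
                }
                // toString, equals, hashCode and setters are not handled in this sketch
                throw new UnsupportedOperationException(m);
            }
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type }, handler);
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        store.set("name", "Ada");
        store.set("age", 36);
        Person person = wrap(store, Person.class);
        System.out.println("reads before getter: " + store.reads);
        System.out.println(person.getName());
        System.out.println("reads after getter: " + store.reads);
    }
}
```

Creating the proxy reads nothing; only the property actually requested is fetched, which is the behaviour that keeps comparisons inside a traversal cheap. A real implementation would also need to handle the Object methods and run the delegated read inside a transaction.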
[Neo] Retrieving nodes ordered by property
Hi, Anybody has any experience returning indexed nodes ordered by a given property?. For example return all nodes ordered by creationDate. I understand that if the node property is not indexed I'd have to iterate over all nodes first then order then limit the results which seems overkill to me. I'd like to be able to do... get me all nodes from start to limit ordered by property. This is necessary when the data is iterated over using pagination and the order determines what the next start node is next. ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] Neo in a cluster?
Hi Anders, We tried following the example but found that when it is used in a webapp the whole request needs to be wrapped, since neo4j requires transactions for reads too. We use some objects that lazily print properties in tapestry / jsp pages, and eventually we'd get errors because the calls to read the properties were outside the scope of @Transactional. We ended up writing a small filter that does the trick for us...

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

import org.neo4j.api.core.NeoService;
import org.neo4j.api.core.Transaction;
import org.springframework.web.context.WebApplicationContext;
import org.springframework.web.context.support.WebApplicationContextUtils;

/**
 * Advises transactional requests by wrapping them in a transaction.
 */
public class GraphTransactionFilter implements Filter {

    private ServletContext servletContext;
    private NeoService neoService;

    /**
     * The <code>doFilter</code> method of the Filter is called by the container
     * each time a request/response pair is passed through the chain due
     * to a client request for a resource at the end of the chain. The FilterChain
     * passed in to this method allows the Filter to pass on the request and
     * response to the next entity in the chain.
     *
     * A typical implementation of this method would follow the following pattern:
     * 1. Examine the request
     * 2. Optionally wrap the request object with a custom implementation to
     *    filter content or headers for input filtering
     * 3. Optionally wrap the response object with a custom implementation to
     *    filter content or headers for output filtering
     * 4. a) Either invoke the next entity in the chain using the FilterChain
     *       object (<code>chain.doFilter()</code>),
     *    b) or not pass on the request/response pair to the next entity in the
     *       filter chain to block the request processing
     * 5. Directly set headers on the response after invocation of the next
     *    entity in the filter chain.
     */
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        if (neoService == null) {
            initServices();
        }
        Transaction tx = neoService.beginTx();
        try {
            chain.doFilter(request, response);
            tx.success();
        } catch (Throwable t) {
            // Marks the transaction as failed; note that the exception is not rethrown.
            tx.failure();
        } finally {
            tx.finish();
        }
    }

    // Looks up the NeoService bean from the Spring web application context.
    private synchronized void initServices() {
        WebApplicationContext applicationContext =
                WebApplicationContextUtils.getWebApplicationContext(servletContext);
        if (applicationContext != null) {
            neoService = (NeoService) applicationContext.getBean("neoService");
        }
    }

    /**
     * Called by the web container to indicate to a filter that it is being placed into
     * service. The servlet container calls the init method exactly once after
     * instantiating the filter. The init method must complete successfully before the
     * filter is asked to do any filtering work.
     *
     * The web container cannot place the filter into service if the init method either
     * 1. Throws a ServletException
     * 2. Does not return within a time period defined by the web container
     */
    public void init(FilterConfig filterConfig) throws ServletException {
        ServletContext context = filterConfig.getServletContext();
        servletContext = context;
        initServices();
    }

    /**
     * Called by the web container to indicate to a filter that it is being taken out of
     * service. This method is only called once all threads within the filter's doFilter
     * method have exited or after a timeout period has passed. After the web container
     * calls this method, it will not call the doFilter method again on this instance of
     * the filter.
     *
     * This method gives the filter an opportunity to clean up any resources that are
     * being held (for example, memory, file handles, threads) and make sure that any
     * persistent state is synchronized with the filter's current state in memory.
     */
    public void destroy() {
    }
}

2010/1/2 Anders Nawroth and...@neotechnology.com Hi Raul!
allow you to group operations in a transactional context. We also integrate with Spring and we are working on the @Transactional support

The Neo4j IMDB example uses Spring @Transactional: http://wiki.neo4j.org/content/IMDB_Transaction_handling

Just in case you didn't know!

/anders

-- Raul Raja
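The request filter above commits or rolls back one transaction per request. The callback idea mentioned in the quoted text, grouping several operations in one transactional context, could be sketched along the same lines. This is only an illustration under stated assumptions: `Tx`, `TxSource`, `TxTemplate` and `RecordingTxSource` are hypothetical names, not part of Neo4j or Spring, and `Tx` merely mirrors the success/failure/finish calls used in the filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a graph transaction (same three calls as in the filter). */
interface Tx {
    void success();
    void failure();
    void finish();
}

/** Hypothetical source of transactions; a real one would delegate to the graph service. */
interface TxSource {
    Tx beginTx();
}

/** Runs a unit of work inside one transaction so its operations commit or roll back together. */
final class TxTemplate {
    private final TxSource source;

    TxTemplate(TxSource source) {
        this.source = source;
    }

    <T> T execute(Supplier<T> work) {
        Tx tx = source.beginTx();
        try {
            T result = work.get();
            tx.success();
            return result;
        } catch (RuntimeException e) {
            // Mark for rollback, then let the caller see the original error.
            tx.failure();
            throw e;
        } finally {
            tx.finish();
        }
    }
}

/** In-memory TxSource that records the calls it receives; for demonstration only. */
final class RecordingTxSource implements TxSource {
    final List<String> events = new ArrayList<>();

    @Override
    public Tx beginTx() {
        events.add("begin");
        return new Tx() {
            @Override public void success() { events.add("success"); }
            @Override public void failure() { events.add("failure"); }
            @Override public void finish() { events.add("finish"); }
        };
    }
}
```

A caller would pass the grouped operations as a lambda, for example `template.execute(() -> loadAndUpdate(id))`, where `loadAndUpdate` is whatever application code needs to run atomically. Unlike the filter, this sketch rethrows the failure after marking the transaction as failed.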
Re: [Neo] Neo in a cluster?
Hi Rick,

I'm not aware of any metrics for DataNucleus. I have personally used it only in the context of App Engine. Google uses DataNucleus as a JPA / JDO bridge to their data backend for Java App Engine apps. http://code.google.com/appengine/docs/java/datastore/usingjpa.html

As far as I understand DataNucleus, the speed of the sorting will depend on how you implement the access to neo as a DataNucleus store. We have the same problem with sorting and ordering results coming from the neo store; if you have any tips they'll be much appreciated.

2010/1/2 Rick Bullotta rick.bullo...@burningskysoftware.com

Raul, do you know of any performance metrics/examples for the DataNucleus access/query layer? We've currently implemented our own SQL-like query layer on top of Neo (and other non-SQL sources), but would be interested in exploring DataNucleus if the performance implications of the extra layers aren't too substantial (our current implementation is very specifically optimized for our use case, and can query/filter/sort/serialize a few thousand records in about 30 ms). In particular, I'd be interested to know if there are any examples of an in-memory provider/persistence layer being queried using JDOQL, SQL, or JPQL that could be used to gauge raw performance of the query layer (filtering, sorting, aggregates, etc.).

-----Original Message-----
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Raul Raja Martinez
Sent: Friday, January 01, 2010 11:16 PM
To: Neo user discussions
Subject: Re: [Neo] Neo in a cluster?

I forgot to mention that if implementing JPA/JDO it'd probably be good to do it as a DataNucleus store, for example http://www.datanucleus.org/plugins/store.db4o/index.html

2010/1/1 Raul Raja Martinez raulr...@gmail.com

Hi Peter,

Yes, we looked at jo4neo and found it very interesting, and it probably suits most people's use cases. In our particular case these are the reasons why we didn't choose it.

1. jo4neo is tightly coupled to neo4j. Our implementation is based on neo4j, but the interface impls are defined so that they can be swapped for other graph-based storage in case we decide not to use neo4j in some other project.

2. Our implementation never handles transactions directly; jo4neo does. jo4neo does not allow control over transactions and creates a transaction per operation: http://jo4neo.googlecode.com/svn/trunk/jo4neo/src/main/java/jo4neo/DeleteOpertation.java
For a webapp we provide a filter that wraps the request in a transaction, allowing you to group operations, and we also plan to support callbacks that allow you to group operations in a transactional context. We also integrate with Spring and we are working on @Transactional support.

3. jo4neo loads all properties for a bean even when these are not queried or used. We enforce the use of interfaces, not beans, and we create a dynamic proxy that implements the interface so that calls to properties and relationships are proxied and delegated to the underlying nodes without the need to use reflection. So when you load an object by id, we do not load its properties or relationships unless you use them, and even when you use them they don't get cached in the proxy, so it saves memory. We trust that neo4j caches the nodes and properties that are used the most.

As far as JPA is concerned... yes and no. I think you'd gain more acceptance if you implemented JPA; on the other hand, the JPA spec pretty much assumes the storage is based on a relational database. They say:

"The Java Persistence API deals with the way relational data is mapped to Java objects (persistent entities), the way that these objects are stored in a relational database so that they can be accessed at a later time, and the continued existence of an entity's state even after the application that uses it ends. In addition to simplifying the entity persistence model, the Java Persistence API standardizes object-relational mapping."

So it'd be great to have a partial JPA impl for basic querying and annotations, but neo4j's approach to persistence is much more flexible and not constrained by the relational model.

2010/1/1 Peter Neubauer peter.neuba...@neotechnology.com

Raul, thanks for the info! Have you looked at Taylor's jo4neo, http://code.google.com/p/jo4neo/ which is taking a similar approach, and do you think there would be value in having a JPA adapter for Neo4j? We would be happy to hear about your experience there! Happy New Year!

/peter neubauer
COO and Sales, Neo Technology
GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Relationships count.
http://gremlin.tinkerpop.com - PageRank in 2 lines of code.
http
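On the sorting and ordering problem raised in this message: one possible approach is to keep a separate, already sorted index next to the graph, so that a page of results can be read in order without traversing and re-sorting everything. The sketch below is only an illustration under stated assumptions: `OrderedNodeIndex` is a hypothetical name, the index lives in memory, and plain `long` ids stand in for graph nodes. It is not the implementation used by anyone in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Keeps node ids sorted by a property value so that pages can be read in order. */
final class OrderedNodeIndex {
    // Key: property value. Value: ids of the nodes holding that value, in insertion order.
    private final TreeMap<String, List<Long>> idsByValue = new TreeMap<>();

    /** Registers a node id under the property value it should be sorted by. */
    void add(String propertyValue, long nodeId) {
        idsByValue.computeIfAbsent(propertyValue, key -> new ArrayList<>()).add(nodeId);
    }

    /** Returns up to limit node ids, skipping the first offset ids in sort order. */
    List<Long> page(int offset, int limit) {
        List<Long> result = new ArrayList<>();
        int skipped = 0;
        for (Map.Entry<String, List<Long>> entry : idsByValue.entrySet()) {
            for (Long id : entry.getValue()) {
                if (skipped < offset) {
                    skipped++;
                    continue;
                }
                if (result.size() == limit) {
                    return result;
                }
                result.add(id);
            }
        }
        return result;
    }
}
```

The index has to be updated whenever the sorted property changes, which is the price for cheap ordered reads. This version still walks over the skipped entries; resuming from the last key seen (for example with `TreeMap.tailMap`) would avoid that for deep pages.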
Re: [Neo] Neo in a cluster?
In your framework, when you load a POJO, do all of its properties get loaded by calling getProperty on the nodes? Looking forward to learning more about it.

On Jan 2, 2010 3:43 PM, Avishay Orpaz avish...@yahoo.com wrote:

Wow... it's becoming crowded out there. I'm also designing an annotation-based persistence framework for neo4j. Hopefully, it will become public (open source, of course) in a few days. Generally it follows the same lines as jo4neo (a project I learned about just recently). There are, however, some noticeable differences:

1. Transactions are left out of the framework, and their management is left to the user.
2. Relationships can also be mapped to POJOs.
3. Embedded objects are not stored in serialized binary form. They can be annotated and stored as key-value pairs, just like the main object.

There is more, but I'd like to hear about other features that could be interesting in such a framework.

Avishay

From: Raul Raja Martinez raulr...@gmail.com
To: Neo user discussions user@lists.neo4j.org
Sent: Sat, January 2, 2010 2:09:09 AM
Subject: Re: [Neo] Neo in a cluster?

Hi Peter, Yes we looked at jo4neo and found it very interesting and it probably suits most people u...
Re: [Neo] Neo in a cluster?
Hi Peter,

Yes, we looked at jo4neo and found it very interesting, and it probably suits most people's use cases. In our particular case these are the reasons why we didn't choose it.

1. jo4neo is tightly coupled to neo4j. Our implementation is based on neo4j, but the interface impls are defined so that they can be swapped for other graph-based storage in case we decide not to use neo4j in some other project.

2. Our implementation never handles transactions directly; jo4neo does. jo4neo does not allow control over transactions and creates a transaction per operation: http://jo4neo.googlecode.com/svn/trunk/jo4neo/src/main/java/jo4neo/DeleteOpertation.java
For a webapp we provide a filter that wraps the request in a transaction, allowing you to group operations, and we also plan to support callbacks that allow you to group operations in a transactional context. We also integrate with Spring and we are working on @Transactional support.

3. jo4neo loads all properties for a bean even when these are not queried or used. We enforce the use of interfaces, not beans, and we create a dynamic proxy that implements the interface so that calls to properties and relationships are proxied and delegated to the underlying nodes without the need to use reflection. So when you load an object by id, we do not load its properties or relationships unless you use them, and even when you use them they don't get cached in the proxy, so it saves memory. We trust that neo4j caches the nodes and properties that are used the most.

As far as JPA is concerned... yes and no. I think you'd gain more acceptance if you implemented JPA; on the other hand, the JPA spec pretty much assumes the storage is based on a relational database. They say:

"The Java Persistence API deals with the way relational data is mapped to Java objects (persistent entities), the way that these objects are stored in a relational database so that they can be accessed at a later time, and the continued existence of an entity's state even after the application that uses it ends. In addition to simplifying the entity persistence model, the Java Persistence API standardizes object-relational mapping."

So it'd be great to have a partial JPA impl for basic querying and annotations, but neo4j's approach to persistence is much more flexible and not constrained by the relational model.

2010/1/1 Peter Neubauer peter.neuba...@neotechnology.com

Raul, thanks for the info! Have you looked at Taylor's jo4neo, http://code.google.com/p/jo4neo/ which is taking a similar approach, and do you think there would be value in having a JPA adapter for Neo4j? We would be happy to hear about your experience there! Happy New Year!

/peter neubauer
COO and Sales, Neo Technology
GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Relationships count.
http://gremlin.tinkerpop.com - PageRank in 2 lines of code.
http://www.linkedprocess.org - Computing at LinkedData scale.

On Thu, Dec 31, 2009 at 10:53 PM, Raul Raja Martinez raulr...@gmail.com wrote:

Hi Johan,

It does and we're very excited about Neo4j. Can't wait for your clustering support. We have developed an annotation-based system on top of Neo, based on interfaces and Java dynamic proxies, that allows you to delegate all state lookup and relationships to the neo store, such as

@Node
public interface Person {

    @Id
    Long getId();

    @Property(indexed=true, unique=true, fulltext=true)
    String getName();

    void setName(String name);

    @Relationship
    City getCity();

    @Relationship
    List<Person> getFriends();

    @Traverser(returnableEvaluator=Whatever.class)
    List<Person> getAllFriendsCloseBy();
}

It's a lazy-lookup-based system that wraps the underlying nodes and delegates calls to the right operations on the node. It supports Date, enums and any arbitrary types that can be configured through converters. We're going to run it in a prod system in the next few months and plan to release it as open source.

We have extensive experience with Hibernate and other JPA-based implementations, and the ease of use and speed with neo4j so far is better when it comes to complex relationships or operations that require multiple joins in a fully normalized relational model. Our current challenge is returning ordered relationships when displaying the data, since it requires the nodes/entities to be returned in a specific order based on node property values.

Anyway, thanks for your responses and good job with Neo4J.

2009/12/31 Johan Svensson jo...@neotechnology.com

Hi,

On Wed, Dec 30, 2009 at 5:07 PM, Raul Raja Martinez raulr...@gmail.com wrote:

Hi everybody, We're evaluating neo4j and we're very pleased with it so far. I have a few questions/concerns as far
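The interface-plus-dynamic-proxy idea described in this message can be illustrated with the JDK's `java.lang.reflect.Proxy`. The following is a minimal sketch, not the framework discussed here: a plain `Map` stands in for a node's properties, `NodeBackedHandler`, `wrap` and `propertyName` are hypothetical names, the `Person` interface is cut down to two accessors, and annotation handling, relationships and converters are left out.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;

/** Example entity interface; only two accessors are shown and annotations are omitted. */
interface Person {
    String getName();
    void setName(String name);
}

/** Forwards getX() and setX(value) calls to a property map that stands in for a graph node. */
final class NodeBackedHandler implements InvocationHandler {
    private final Map<String, Object> nodeProperties; // assumption: stand-in for a node

    NodeBackedHandler(Map<String, Object> nodeProperties) {
        this.nodeProperties = nodeProperties;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        String name = method.getName();
        // For methods without parameters the JDK passes args as null.
        if (name.startsWith("get") && name.length() > 3 && args == null) {
            // Looked up on every call; nothing is cached in the proxy itself.
            return nodeProperties.get(propertyName(name));
        }
        if (name.startsWith("set") && name.length() > 3 && args != null && args.length == 1) {
            nodeProperties.put(propertyName(name), args[0]);
            return null;
        }
        // Anything else (including toString/equals/hashCode) is not handled in this sketch.
        throw new UnsupportedOperationException(name);
    }

    /** Maps an accessor name such as "getName" or "setName" to the property key "name". */
    private static String propertyName(String accessor) {
        return Character.toLowerCase(accessor.charAt(3)) + accessor.substring(4);
    }

    /** Creates a proxy that implements the given interface on top of the property map. */
    static <T> T wrap(Class<T> type, Map<String, Object> nodeProperties) {
        Object proxy = Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, new NodeBackedHandler(nodeProperties));
        return type.cast(proxy);
    }
}
```

Calling `NodeBackedHandler.wrap(Person.class, properties)` yields a `Person` whose getters read straight from the backing map, so a later change to the underlying data is visible on the next call. A real implementation would read the annotations to decide between properties, relationships and traversers.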
Re: [Neo] Neo in a cluster?
I forgot to mention that if implementing JPA/JDO it'd probably be good to do it as a DataNucleus store, for example http://www.datanucleus.org/plugins/store.db4o/index.html

2010/1/1 Raul Raja Martinez raulr...@gmail.com

Hi Peter,

Yes, we looked at jo4neo and found it very interesting, and it probably suits most people's use cases. In our particular case these are the reasons why we didn't choose it.

1. jo4neo is tightly coupled to neo4j. Our implementation is based on neo4j, but the interface impls are defined so that they can be swapped for other graph-based storage in case we decide not to use neo4j in some other project.

2. Our implementation never handles transactions directly; jo4neo does. jo4neo does not allow control over transactions and creates a transaction per operation: http://jo4neo.googlecode.com/svn/trunk/jo4neo/src/main/java/jo4neo/DeleteOpertation.java
For a webapp we provide a filter that wraps the request in a transaction, allowing you to group operations, and we also plan to support callbacks that allow you to group operations in a transactional context. We also integrate with Spring and we are working on @Transactional support.

3. jo4neo loads all properties for a bean even when these are not queried or used. We enforce the use of interfaces, not beans, and we create a dynamic proxy that implements the interface so that calls to properties and relationships are proxied and delegated to the underlying nodes without the need to use reflection. So when you load an object by id, we do not load its properties or relationships unless you use them, and even when you use them they don't get cached in the proxy, so it saves memory. We trust that neo4j caches the nodes and properties that are used the most.

As far as JPA is concerned... yes and no. I think you'd gain more acceptance if you implemented JPA; on the other hand, the JPA spec pretty much assumes the storage is based on a relational database. They say:

"The Java Persistence API deals with the way relational data is mapped to Java objects (persistent entities), the way that these objects are stored in a relational database so that they can be accessed at a later time, and the continued existence of an entity's state even after the application that uses it ends. In addition to simplifying the entity persistence model, the Java Persistence API standardizes object-relational mapping."

So it'd be great to have a partial JPA impl for basic querying and annotations, but neo4j's approach to persistence is much more flexible and not constrained by the relational model.

2010/1/1 Peter Neubauer peter.neuba...@neotechnology.com

Raul, thanks for the info! Have you looked at Taylor's jo4neo, http://code.google.com/p/jo4neo/ which is taking a similar approach, and do you think there would be value in having a JPA adapter for Neo4j? We would be happy to hear about your experience there! Happy New Year!

/peter neubauer
COO and Sales, Neo Technology
GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Relationships count.
http://gremlin.tinkerpop.com - PageRank in 2 lines of code.
http://www.linkedprocess.org - Computing at LinkedData scale.

On Thu, Dec 31, 2009 at 10:53 PM, Raul Raja Martinez raulr...@gmail.com wrote:

Hi Johan,

It does and we're very excited about Neo4j. Can't wait for your clustering support. We have developed an annotation-based system on top of Neo, based on interfaces and Java dynamic proxies, that allows you to delegate all state lookup and relationships to the neo store, such as

@Node
public interface Person {

    @Id
    Long getId();

    @Property(indexed=true, unique=true, fulltext=true)
    String getName();

    void setName(String name);

    @Relationship
    City getCity();

    @Relationship
    List<Person> getFriends();

    @Traverser(returnableEvaluator=Whatever.class)
    List<Person> getAllFriendsCloseBy();
}

It's a lazy-lookup-based system that wraps the underlying nodes and delegates calls to the right operations on the node. It supports Date, enums and any arbitrary types that can be configured through converters. We're going to run it in a prod system in the next few months and plan to release it as open source.

We have extensive experience with Hibernate and other JPA-based implementations, and the ease of use and speed with neo4j so far is better when it comes to complex relationships or operations that require multiple joins in a fully normalized relational model. Our current challenge is returning ordered relationships when displaying the data, since it requires the nodes/entities to be returned in a specific order based on node property values.

Anyway, thanks for your responses and good job with Neo4J.

2009/12/31 Johan Svensson