Re: [Neo4j] social network may exceed the number of nodes / relationships / properties
Hi,

I may be mistaken, but I think your estimates are a bit high. Neo4j is running in production in many social deployments. If you want to build a social network, then Neo4j is the way to go. By the time your social network outgrows the current limits of Neo4j, we will most likely support sharding for wide-scale deployments. Feel free to keep us posted on your progress.

David

On Wed, Nov 30, 2011 at 9:24 AM, gustavoboby gustavob...@gmail.com wrote:

Hi all,

I need to implement private messages and posts in a social network, the same way Facebook does. However, I have the following concern: up to 1 million users are expected in the social network, and each person might write one million times (posts and private messages combined) in one year. My fear is exceeding the maximum number of properties / relationships / nodes. What would you do in this situation? What do you recommend?

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/social-network-may-exceed-the-number-of-nodes-relationships-properties-tp3549025p3549025.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

--
David Montag david.mon...@neotechnology.com
Neo Technology, www.neotechnology.com
Cell: 650.556.4411
Skype: ddmontag
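To put rough numbers on the concern raised in this thread, here is a back-of-the-envelope check as a small Java sketch. Two things in it are assumptions and do not come from the thread: the store limit (about 2^35 nodes, as commonly cited for Neo4j releases of that period; please check the documentation for your version) and the second, more modest write rate of 1,000 writes per user per year, which is purely hypothetical.

```java
// Rough capacity check for the numbers quoted in the question.
final class CapacityEstimate {

    // ASSUMPTION (not stated in the thread): roughly 2^35 nodes per store.
    static final long ASSUMED_MAX_NODES = 1L << 35;

    // Total items written per year; throws ArithmeticException on long overflow.
    static long totalWrites(long users, long writesPerUserPerYear) {
        return Math.multiplyExact(users, writesPerUserPerYear);
    }

    // True if the item count stays within the given limit.
    static boolean fits(long items, long limit) {
        return items <= limit;
    }

    public static void main(String[] args) {
        // Figures from the question: 1 million users, 1 million writes each per year.
        long asAsked = totalWrites(1_000_000L, 1_000_000L);
        // Hypothetical, more modest rate: 1,000 writes per user per year.
        long modest = totalWrites(1_000_000L, 1_000L);

        System.out.println("As asked: " + asAsked + " items/year, fits: "
                + fits(asAsked, ASSUMED_MAX_NODES));
        System.out.println("Modest:   " + modest + " items/year, fits: "
                + fits(modest, ASSUMED_MAX_NODES));
    }
}
```

Under these assumptions the question's figure works out to 10^12 items per year, which is why the per-user write rate is the number worth revisiting first.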
Re: [Neo4j] Server plugin running into memory limits
Hi Anders,

How much heap are you giving the JVM? When it worked with tons of memory, how much heap did you give it? Maybe you could also try changing the garbage collector used in conf/neo4j-wrapper.conf. Just add these lines:

    wrapper.java.additional.20=-XX:+UseConcMarkSweepGC
    wrapper.java.additional.21=-verbose:gc

David

2011/11/15 Anders Lindström andli...@hotmail.com

Hi all,

I'm currently writing a server plugin. I need it to make some specialized queries that are not supported by the standard REST API. The important methods I expose are 'query' and 'get_next_page', the latter to support results pagination (i.e. the plugin is stateful).

In 'query', I run my query against the Neo4j backend and store a Node iterator over the query results (this is either an iterator originating from 'getAllNodes', or a Lucene IndexHits<Node> instance). In 'get_next_page', I run through the next N items of the iterator and return these as a ListRepresentation. The same iterator object is kept across all page retrievals, but of course stepped forward N steps for every invocation. After having gone through all pages, the reference to the Node iterator is removed.

Now, as I understand it, the only heap space I should be concerned about is what I allocate locally in my methods, since the stored reference to the iterator object is just a tiny reference, and iterator results are fetched lazily (i.e., even though the iterator covers a result set greater than the allotted heap size, I should be able to page through it within the given heap space if the page size is small enough).

But when I run my plugin, this does not seem to be the case. I can make several successful calls in a row to 'get_next_page', but then after a while I bump into "GC overhead limit exceeded", which I cannot quite understand. I am rather certain the size of each page returned is within the allotted heap size. For some reason the heap usage seems to grow with the calls to 'get_next_page', which I cannot explain given my understanding of the Node iterators from Neo4j.

How do I avoid hitting this GC overhead limit? Am I missing something? (And yes, I've tried using different values of the allowed heap space by fiddling in the conf files, and sure, I can give tons of memory to the instance, and then it works, but I shouldn't have to give more heap space than what Neo4j needs, plus my page size.)

Thanks!

Regards,
Anders
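For reference, the paging scheme described in this thread can be sketched with nothing but java.util. This is only an illustration of the pattern (one long-lived iterator, N items pulled per call, reference dropped at the end), not the actual plugin code and not a Neo4j API: the class name PagedCursor and its methods are hypothetical. It also does not by itself explain the GC overhead error, since whatever the underlying iterator keeps reachable (caches, index state) still counts against the heap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: pages through a lazily evaluated iterator.
final class PagedCursor<T> {

    // Set to null once exhausted, so the iterator and anything it holds can be collected.
    private Iterator<T> source;

    PagedCursor(Iterator<T> source) {
        this.source = source;
    }

    // Returns up to pageSize items; only the current page is materialized.
    List<T> nextPage(int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<T> page = new ArrayList<>(pageSize);
        while (source != null && page.size() < pageSize && source.hasNext()) {
            page.add(source.next());
        }
        // Drop the reference as soon as nothing is left to read.
        if (source != null && !source.hasNext()) {
            source = null;
        }
        return page;
    }

    boolean isExhausted() {
        return source == null;
    }

    public static void main(String[] args) {
        PagedCursor<Integer> cursor = new PagedCursor<>(Arrays.asList(1, 2, 3, 4, 5).iterator());
        while (!cursor.isExhausted()) {
            System.out.println(cursor.nextPage(2));
        }
    }
}
```

One design point worth checking in a real plugin: if a client stops paging halfway, the cursor is never exhausted and stays referenced, so some expiry for abandoned cursors is probably needed. If the iterator is an IndexHits instance, it should also be closed when it is dropped early.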
Re: [Neo4j] Neo4j REST server's log files
Yup, Skype is good. Does some time in the afternoon PST work for you?

David

On Fri, Nov 11, 2011 at 3:11 AM, andrew ton andrewt...@yahoo.com wrote:

Hi David,

I'm happy to. Do we use Skype?

Thanks,
A.

From: David Montag david.mon...@neotechnology.com
To: Neo4j user discussions user@lists.neo4j.org
Sent: Thursday, November 10, 2011 9:34 PM
Subject: Re: [Neo4j] Neo4j REST server's log files

Hi Andrew,

Would you like to do a screen sharing session with me Friday PST? That way, I could assess your problem better.

Thanks,
David

On Thu, Nov 10, 2011 at 4:23 PM, andrew ton andrewt...@yahoo.com wrote:

What I meant by "the server stops responding to my application's request" was that my application received a NoHttpResponseException. This is the output in the Eclipse console:

-Uploading /doc/test/ont/Ontology1320789957941.owl to Neo4J...
Nov 10, 2011 3:46:52 PM org.restlet.ext.httpclient.HttpClientHelper start
INFO: Starting the Apache HTTP client
Root node of ontology Ontology1320789957941.owl: http://localhost:7474/db/data/index/node/my_index/name/Ontology1320789957941
Find node: http://localhost:7474/db/data/index/node/my_index/name/super_node
Processing triple - http://localhost:7474/db/data/node/1,CONTAINS,http://localhost:7474/db/data/node/147
Find node: http://localhost:7474/db/data/index/node/my_index/name/lowEnergy
Nov 10, 2011 3:47:22 PM org.restlet.ext.httpclient.internal.HttpMethodCall sendRequest
WARNING: An error occurred during the communication with the remote HTTP server.
org.apache.http.NoHttpResponseException: The target server failed to respond
    at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:101)

If I upload only a few ontologies, there are no problems, and log/neo4j.x.x.log shows that the server processed the lowEnergy node:

Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: REQUEST /db/data/node/1/relationships/out/CONTAINS on org.mortbay.jetty.HttpConnection@237dc815
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: sessionManager=org.mortbay.jetty.servlet.HashSessionManager@375e293a
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: session=null
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: servlet=org.neo4j.server.web.NeoServletContainer-1130213695
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: chain=org.neo4j.server.statistic.StatisticFilter-org.neo4j.server.web.NeoServletContainer-1130213695
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: servlet holder=org.neo4j.server.web.NeoServletContainer-1130213695
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: call filter org.neo4j.server.statistic.StatisticFilter
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: call servlet org.neo4j.server.web.NeoServletContainer-1130213695
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: RESPONSE /db/data/node/1/relationships/out/CONTAINS 200]
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: REQUEST /db/data/node/1/relationships on org.mortbay.jetty.HttpConnection@237dc815
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: sessionManager=org.mortbay.jetty.servlet.HashSessionManager@375e293a
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: session=null
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: servlet=org.neo4j.server.web.NeoServletContainer-1130213695
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: chain=org.neo4j.server.statistic.StatisticFilter-org.neo4j.server.web.NeoServletContainer-1130213695
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: servlet holder=org.neo4j.server.web.NeoServletContainer-1130213695
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: call filter org.neo4j.server.statistic.StatisticFilter
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: call servlet org.neo4j.server.web.NeoServletContainer-1130213695
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: RESPONSE /db/data/node/1/relationships 201
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: REQUEST /db/manage/server/monitor/fetch/1320942809 on org.mortbay.jetty.HttpConnection@66e90097
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: Got Session ID 9fdtr96dq71c1iqw1b8416rpc from cookie
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: sessionManager=org.mortbay.jetty.servlet.HashSessionManager@375e293a
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: session=null
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: servlet=org.neo4j.server.web.NeoServletContainer-1301864188
Nov 10, 2011 3:46:52 PM org.mortbay.log.Slf4jLog debug
FINE: chain=org.neo4j.server.statistic.StatisticFilter
Re: [Neo4j] START a = node(10) vs START a = (10)
Yes, it is. Head over to http://neo4j.org for the download while it's hot :)

David

On Fri, Nov 11, 2011 at 9:15 AM, yobi johnny@yobistore.com wrote:

Just wanna follow up, is 1.5 up yet?

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/START-a-node-10-vs-START-a-10-tp3497036p3500347.html
Re: [Neo4j] Neo4j REST server's log files
That works. See ya then. I'm ddmontag on Skype.

David

On Fri, Nov 11, 2011 at 9:55 AM, andrew ton andrewt...@yahoo.com wrote:

Hi David,

Anytime in the afternoon is fine to me. How about 2:30pm?

Andrew
Re: [Neo4j] Neo4j REST server's log files
Hi Andrew,

Good to hear that you got the logging sorted out. Regarding the actual issues, it sounds like you're describing two different things. One is that you index ontology nodes in one global index and therefore run into conflicts. The other is that the upload appears to stall for some reason.

Regarding the global index, did you intend to design the system that way, or do you really want a separate index for each ontology? It sounds like that would be reasonable.

As for the stalled upload, a thread dump during the slow processing would be most helpful. On a Linux system, you can capture that by doing kill -3 pid on the Java process. It should then go to console.log.

Thanks,
David

On Thu, Nov 10, 2011 at 8:21 AM, andrew ton andrewt...@yahoo.com wrote:

Hi David,

Thank you for getting back to me! I finally got it to work by changing the log level from INFO (the default) to FINEST in logging.properties.

I have a question for you though. Currently my project has a problem with uploading data to the store. I created only one index for the whole application, with the node name as the key. Several nodes in different ontologies have the same name. So when an ontology is uploaded to the store and contains a node whose name is already in the index (from a previous ontology), this node is not created in the graph of this ontology.

My app can upload a number of ontologies, and graphs are created successfully for each of them in the store. However, when the process uploads the 9th ontology, the store does not respond, and it seems busy with some internal process such as looking up the node in the index. I'm stuck and don't know the cause of the problem. Do you have any clue or suggestions? I appreciate your help!

Regards,

From: David Montag david.mon...@neotechnology.com
To: Peter Neubauer peter.neuba...@neotechnology.com
Cc: Neo4j user discussions user@lists.neo4j.org
Sent: Wednesday, November 9, 2011 10:07 PM
Subject: Re: [Neo4j] Neo4j REST server's log files

Hi Andrew,

Let's connect during the day tomorrow for a higher-bandwidth discussion. Do you have Skype?

Thanks,
David

On Wed, Nov 9, 2011 at 1:23 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote:

Andrew,

this sounds like the RRD database in the server got broken. Could you delete data/rrd and start up again? Also, David is in your timezone and maybe can connect with you directly to look into this?

Cheers,
/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - NOSQL for the Enterprise.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.

On Wed, Nov 9, 2011 at 10:07 PM, andrew ton andrewt...@yahoo.com wrote:

Hi Peter,

I don't quite understand your question. However, in my Restlet application I have a logging service that uses Java logging to record all processes in my app. My problem is that I upload 20 ontologies, and after several ontologies Neo4j stops responding to my REST requests. It seems busy with some index lookup process. Consequently my app throws a NoHttpResponseException: The target server failed to respond. I'd like to see what causes the problem inside Neo4j. Unfortunately, neither messages.log nor neo4j.x.x.log shows what happens at run time.

BTW, when I start up the server, neo4j.x.x.log shows:

INFO: Server started on [http://localhost:7474/]
Nov 9, 2011 10:22:22 AM org.neo4j.server.logging.Logger log
WARNING: java.lang.IllegalArgumentException: Bad sample time: 1320862942. Last update time was 1320862942, at least one second step is required
    at org.rrd4j.core.RrdDb.store(RrdDb.java:553)
    at org.rrd4j.core.Sample.update(Sample.java:197)
    at org.neo4j.server.rrd.RrdSamplerImpl.updateSample(RrdSamplerImpl.java:62)
    at org.neo4j.server.rrd.RrdFactory$1.updateSample(RrdFactory.java:109)
    at org.neo4j.server.rrd.RrdJob.run(RrdJob.java:43)
    at org.neo4j.server.rrd.ScheduledJob$1.run(ScheduledJob.java:41)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
Nov 9, 2011 12:43:44 PM com.sun.jersey.api.core.PackagesResourceConfig init
INFO: Scanning for root resource and provider classes in the packages:
  org.neo4j.server.webadmin.rest
Nov 9, 2011 12:43:44 PM com.sun.jersey.api.core.ScanningResourceConfig logClasses
INFO: Root resource classes found:
  class org.neo4j.server.webadmin.rest.MonitorService
  class org.neo4j.server.webadmin.rest.RootService
  class org.neo4j.server.webadmin.rest.JmxService
  class org.neo4j.server.webadmin.rest.ConsoleService
Nov 9, 2011 12:43:44 PM com.sun.jersey.api.core.ScanningResourceConfig init
INFO: No provider classes found.
Nov
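On the question of one global index versus one index per ontology: a minimal sketch of how a client could derive a separate index name for each ontology and build the lookup URL from it. The URL layout follows the lookups shown in the console output in this thread (/db/data/index/node/my_index/name/lowEnergy). The "ontology_" prefix, the sanitizing rule and the class and method names are hypothetical, and the sketch assumes the index already exists on the server.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: one node index per ontology instead of a single global index.
final class OntologyIndex {

    // Derives an index name from the ontology name (prefix and character rule are assumptions).
    static String indexNameFor(String ontologyName) {
        return "ontology_" + ontologyName.replaceAll("[^A-Za-z0-9_]", "_");
    }

    // Builds the lookup URL for a node name within that ontology's own index.
    static String lookupUrl(String ontologyName, String nodeName) {
        String path = "/db/data/index/node/" + indexNameFor(ontologyName) + "/name/" + nodeName;
        try {
            // The multi-argument URI constructor quotes characters that are illegal in a path.
            return new URI("http", null, "localhost", 7474, path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build lookup URL for " + nodeName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupUrl("Ontology1320789957941.owl", "lowEnergy"));
        System.out.println(lookupUrl("AnotherOntology.owl", "lowEnergy"));
    }
}
```

With names built this way, two ontologies that both contain a node called lowEnergy resolve to two different index entries, so the second upload no longer finds the first ontology's node.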
Re: [Neo4j] Neo4j REST server's log files
Hi Andrew,

I don't see anything running in that thread dump. No threads are processing requests. Are you sure it's taking a long time, or is it maybe finished? Try capturing the thread dump exactly when you experience slowness. I'd also suggest bumping the initial heap to at least 512MB or something like that. How much RAM do you have? How much data are you inserting?

Thanks,
David

On Thu, Nov 10, 2011 at 2:56 PM, andrew ton andrewt...@yahoo.com wrote:

Hi David,

I increased the memory settings in neo4j-wrapper.conf as below:

    # Initial Java Heap Size (in MB)
    wrapper.java.initmemory=30
    # Maximum Java Heap Size (in MB)
    wrapper.java.maxmemory=1024

However the same problem is still happening. I attach the thread dump to this mail. I appreciate it if you tell me what's wrong based on the thread dump. Thank you!
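On capturing a thread dump at the right moment: besides kill -3, a JVM can produce a comparable dump of its own threads through the standard java.lang.management API. The sketch below is an illustration only and is not part of Neo4j; the class name is hypothetical. It dumps the JVM it runs in, so for the server it would have to run inside the server process (or be reached over JMX), and kill -3 remains the simpler route on Linux.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: text dump of all live threads in the current JVM.
final class ThreadDumpSketch {

    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock details, which keeps the call cheap.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Threads that are busy serving requests show up as RUNNABLE with request-handling frames on their stacks. A dump in which every worker thread is idle or waiting matches the observation in this thread that nothing significant was running.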
Re: [Neo4j] Neo4j REST server's log files
Can you describe how you see that it stops? Because the thread dump isn't showing anything of significance running.

David

On Thu, Nov 10, 2011 at 3:28 PM, andrew ton andrewt...@yahoo.com wrote:

Hi David,

The total size of files stored into the db when the problem happened was only 800KB. My RAM is 4G. I ran the kill command when the problem happened because it just stopped and I did not notice slowness. I will increase the heap to 512MB and test again.

Thanks,
Andrew
Re: [Neo4j] Neo4j REST server's log files
:46:52 PM DefaultMBeanServerInterceptor getAttribute FINER: Attribute= HeapMemoryUsage, obj= java.lang:type=Memory Nov 10, 2011 3:46:52 PM Repository retrieve FINER: name=java.lang:type=Memory Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: REQUEST /db/manage/server/monitor/fetch/1320942588 on org.mortbay.jetty.HttpConnection@66e90097 Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: Got Session ID 9fdtr96dq71c1iqw1b8416rpc from cookie Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: sessionManager=org.mortbay.jetty.servlet.HashSessionManager@375e293a Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: session=null Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: servlet=org.neo4j.server.web.NeoServletContainer-1301864188 Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: chain=org.neo4j.server.statistic.StatisticFilter-org.neo4j.server.web.NeoServletContainer-1301864188 Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: servlet holder=org.neo4j.server.web.NeoServletContainer-1301864188 Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: call filter org.neo4j.server.statistic.StatisticFilter Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: call servlet org.neo4j.server.web.NeoServletContainer-1301864188 Nov 10, 2011 3:46:53 PM org.mortbay.log.Slf4jLog debug FINE: RESPONSE /db/manage/server/monitor/fetch/1320942588 200 and the server displays a request for next node in my ontology. But when I upload many ontology then when the problem happened the server keeps displaying blocks of request for fetch like FINE: REQUEST /db/manage/server/monitor/fetch/ on org.mortbay.jetty.HttpConnection@66e90097 Sorry for this long mail. I have no clue what causes the problem. 
Thank you very much, Regards From: David Montag david.mon...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Thursday, November 10, 2011 3:31 PM Subject: Re: [Neo4j] Neo4j REST server's log files Can you describe how you see that it stops? Because the thread dump isn't showing anything of significance running. David On Thu, Nov 10, 2011 at 3:28 PM, andrew ton andrewt...@yahoo.com wrote: Hi David, The total size of files stored into the db when the problem happened was only 800KB. My RAM is 4G. I ran the kill command when the problem happened because it just stopped and I did not notice slowness. I will increase the heap to 512MB and test again. Thanks, Andrew From: David Montag david.mon...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Thursday, November 10, 2011 3:00 PM Subject: Re: [Neo4j] Neo4j REST server's log files Hi Andrew, I don't see anything running in that thread dump. No threads are processing requests. Are you sure it's taking a long time, or is it maybe finished? Try capturing the thread dump exactly when you experience slowness. I'd also suggest bumping the initial heap to at least 512MB or something like that. How much RAM do you have? How much data are you inserting? Thanks, David On Thu, Nov 10, 2011 at 2:56 PM, andrew ton andrewt...@yahoo.com wrote: Hi David, I increased the memory settings in neo4j-wrapper.conf as below # Initial Java Heap Size (in MB) wrapper.java.initmemory=30 # Maximum Java Heap Size (in MB) wrapper.java.maxmemory=1024 However the same problem is still happening. I attach the thread dump to this mail. I appreciate it if you tell me what's wrong based on the thread dump. Thank you! From: David Montag david.mon...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Thursday, November 10, 2011 11:39 AM Subject: Re: [Neo4j] Neo4j REST server's log files Hi Andrew, Good to hear that you got the logging sorted out. 
Regarding the actual issues, it sounds like you're describing two different things. One is that you index ontology nodes in one global index and therefore run into conflicts. The other is that the upload appears to stall for some reason. Regarding the global index, did you intend to design the system that way, or do you really want a separate index for each ontology? It sounds like that would be reasonable. As for the stalled upload, a thread dump during the slow processing would be most helpful. On a Linux system, you can capture that by doing kill -3 pid on the Java process. It should then go to console.log. Thanks, David On Thu, Nov 10, 2011 at 8:21 AM, andrew ton andrewt...@yahoo.com wrote: Hi David, Thank you for getting back to me! Finally I can make it work by changing the log level from INFO (by default) to FINEST
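For code that runs inside the same JVM (an embedded database or a server plugin, for example), a comparable dump can be produced without sending a signal. The following is a minimal sketch using only the standard `Thread.getAllStackTraces()` call. The class name `ThreadDumpSketch` is hypothetical, this is not a Neo4j API, and the output is less detailed than what `kill -3` writes to console.log (no lock ownership information).

```java
import java.util.Map;

class ThreadDumpSketch {
    /** Builds a plain-text dump of every live thread in this JVM. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append("\" state=")
               .append(thread.getState()).append('\n');
            // One line per stack frame, innermost frame first.
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

As with the signal-based dump, the result is only useful if it is taken while the stall is actually happening.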
Re: [Neo4j] Neo4j REST server's log files
? On Nov 9, 2011 6:26 PM, andrew ton andrewt...@yahoo.com wrote: Hi Peter, I tried the messages.log but it only showed processes up to the time when the server is up 2011-11-09 08:59:19.669-0800: --- CONFIGURATION END --- 2011-11-09 08:59:19.733-0800: Extension org.neo4j.kernel.KernelExtension[kernel jmx] loaded ok 2011-11-09 08:59:19.857-0800: Extension org.neo4j.kernel.KernelExtension[shell] loaded ok 2011-11-09 08:59:19.858-0800: Extension org.neo4j.kernel.KernelExtension[kernel udc] loaded ok The manual shows that the logging is configured in conf/logging.properties (java.util.logging.FileHandler.pattern=data/log/neo4j.%u.%g.log). I'm using the default settings but the log file was not updated in the run time. My purpose is that I want to see what is wrong when I failed to store an ontology into Neo4J. Please help me out! Thanks and regards, From: Peter Neubauer peter.neuba...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Wednesday, November 9, 2011 8:46 AM Subject: Re: [Neo4j] Neo4j REST server's log files Andrew, The database logs are in data/graphdb / messages.log On Nov 9, 2011 5:39 PM, andrew ton andrewt...@yahoo.com wrote: Hi, What log files of the Neo4J REST server can I use to check what's going on in the server? The log/console.log and neo4j.x.x.log do not show transactions in the run time. I also looked into graph.db/tm_tx_log.x but I did not learn anything from there. Can somebody give me some pointers? 
Thanks, ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag
Re: [Neo4j] Relationships stored order
Hi Evgeny, Could you maybe describe the use case behind this requirement a bit more? Thanks, David

On Sun, Oct 30, 2011 at 4:01 PM, Evgeny Gazdovsky gazdov...@gmail.com wrote:

PS We don't need a traverse through relationships in stored order, only iterations for start or end node. -- Evgeny

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Relationships stored order
Do they need to be sorted because of some arbitrary reason, or because you're storing data structures like lists that you want to preserve the order of? David

On Thu, Nov 3, 2011 at 4:44 PM, Evgeny Gazdovsky gazdov...@gmail.com wrote:

2011/11/4 David Montag david.mon...@neotechnology.com

Hi Evgeny, Could you maybe describe the use case behind this requirement a bit more?

We use Neo as persistent memory in a new age programming language. Every expression in this language is compiled into a graph structure, so we need a graph with sorted relationships. The sort order is the order in which the relationships are created (stored). -- Evgeny

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Node Id generation deadlock
    = _graphDB.createNode();
    //
    //node.setProperty(idseq, id);
    }

    private long generateId() {
        synchronized (_lock) {
            Long id;
            if (_factoryNode.hasProperty(idseq)) {
                id = (Long) _factoryNode.getProperty(idseq);
            } else {
                id = 1L;
            }
            _factoryNode.setProperty(idseq, id+1);
            return id;
        }
    }
}

(One last thing - I kept the line: System.out.println(echo from thread + Thread.currentThread().getName()); since if I remove it, the deadlock actually does not occur - at least not on my machine. However, this is just a matter of race conditions, and if you use a higher number of threads / created-nodes-per-thread, then the problem will occur even without this line. I decided to keep it since I figured a case with two threads each creating two nodes is as simple as this can get.)

Thank you for your help, Ran.

-- View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Node-Id-generation-deadlock-tp3473118p3473118.html Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Traversal performance
Also, try running it 100 times. Then you should see some JVM optimizations/JIT kick in. David

On Mon, Sep 26, 2011 at 9:24 PM, Rick Devinsus rick.devin...@gmail.com wrote:

That was it: the cache wasn't warmed. I tried running the same test twice, and that increased the speed around 7x (450K traversals per second). Thanks for the help.

-- View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Traversal-performance-tp3371038p3371546.html Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
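The advice to repeat the test can be sketched as a small timing harness. This is only an illustration: the class `WarmupSketch`, the method `timeRounds`, and the array-summing workload are all hypothetical stand-ins, and the real traversal would be passed in place of the workload. Comparing the first and last rounds gives a rough view of how much cache warm-up and JIT compilation contribute.

```java
import java.util.function.LongSupplier;

class WarmupSketch {
    // Written on every round so the JIT cannot discard the workload as dead code.
    static volatile long sink;

    /** Runs the workload `rounds` times and returns the elapsed nanoseconds of each round. */
    static long[] timeRounds(LongSupplier workload, int rounds) {
        long[] nanos = new long[rounds];
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            sink = workload.getAsLong();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        // Stand-in workload (hypothetical): replace with the traversal being measured.
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        LongSupplier workload = () -> {
            long sum = 0;
            for (int v : data) sum += v;
            return sum;
        };
        long[] nanos = timeRounds(workload, 100);
        System.out.println("first round: " + nanos[0] / 1_000 + " us");
        System.out.println("last round:  " + nanos[nanos.length - 1] / 1_000 + " us");
    }
}
```

For serious measurements a dedicated benchmarking harness is usually a better choice than hand-rolled timing loops; the sketch is only meant to show why a single cold run is misleading.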
Re: [Neo4j] Some questions about design when using neo4j
On Tue, Aug 30, 2011 at 1:20 PM, Benjamin Gustafsson benjamingustafs...@gmail.com wrote:

If an item can only be had by one user at a time, then the display, or visible_to, relationship could originate from the item.

9. owesItemTo -relation between User, Item, User/Group. (A triple, Do I need a node?)

Same here.

Thanks, I will rename have to owner_of. Then use the relations
item visible_to user/group
item currently_held_by user/group

Why would you need two relationships? Can you let a single relationship represent the mutual credit line between two users?

If I use a single directed relationship, how could I prepare the data for the traversal to be fast? I know the traversal is equally fast in both directions, but the calculation of remaining_credits can be made in advance or while traversing. I can't really decide if the precalculated remaining_credits properties will make traversal faster or if they just introduce an unnecessary risk of inconsistency. I could have 5 properties:
long balance
long end_node_limit
long start_node_limit
long end_node_remaining_credits
long start_node_remaining_credits

My thoughts are something like this to make a faster traversal: If the traversal finds an incoming mutual_credit relation, it could check the property end_node_remaining_credits to decide if it is enough for the transaction. And if the traversal finds an outgoing mutual_credit relation, it could check the property start_node_remaining_credits to decide if it is enough for the transaction.

Exactly. The direction would multiplex to two properties. That being said, it is OK to have two relationships as well. Since all modifications are in a transactional context, it will always be consistent. It will however require you to find two relationships instead of one, which, depending on how you implement it, may or may not require more time.
David ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
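The idea that the direction multiplexes to two properties can be sketched in plain Java, independent of the Neo4j API. Everything here is an assumption made for illustration: the class `CreditLineSketch`, its `Direction` enum, and in particular the sign convention (a positive balance means the start node currently owes the end node). The sketch derives the remaining credits from the balance and the limits instead of storing them, which is one way to sidestep the inconsistency concern raised in the question.

```java
class CreditLineSketch {
    /** Direction of the relationship as seen from the node the traversal is standing on. */
    enum Direction { OUTGOING, INCOMING }

    final long balance;        // assumed convention: amount the start node owes the end node
    final long startNodeLimit; // the most the start node may owe
    final long endNodeLimit;   // the most the end node may owe

    CreditLineSketch(long balance, long startNodeLimit, long endNodeLimit) {
        this.balance = balance;
        this.startNodeLimit = startNodeLimit;
        this.endNodeLimit = endNodeLimit;
    }

    /** OUTGOING means the current node is the start node; INCOMING means it is the end node. */
    long remainingCredits(Direction seenFromCurrentNode) {
        return seenFromCurrentNode == Direction.OUTGOING
                ? startNodeLimit - balance
                : endNodeLimit + balance;
    }

    boolean canPay(Direction seenFromCurrentNode, long amount) {
        return amount <= remainingCredits(seenFromCurrentNode);
    }

    public static void main(String[] args) {
        CreditLineSketch line = new CreditLineSketch(30, 100, 50);
        System.out.println(line.remainingCredits(Direction.OUTGOING)); // prints 70
        System.out.println(line.remainingCredits(Direction.INCOMING)); // prints 80
    }
}
```

Whether the derived values are computed during the traversal or stored as the two precalculated properties is a trade-off between a little arithmetic per relationship and keeping extra properties in sync on every write.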
Re: [Neo4j] Getting sorted results from a traversal
-Bayes moviepilot GmbH | Mehringdamm 33 | 10961 Berlin | Germany Phone +49 30 616 512 -110 | Fax +49 30 616 512 -133 ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag
Re: [Neo4j] How to filter out a visited node in Cypher?
Hi, It sounds like you want to: return distinct x

David

On Mon, Jul 11, 2011 at 1:06 PM, noppanit noppani...@gmail.com wrote:

Hi! I'm not sure how to do this in Cypher. Basically, I want to count the outgoing nodes, but I don't know the relationships and they're growing. For example,

Node1---[rel1]--Node2 || |---[rel2]--| | [rel3] | Node2

I want to count all the nodes that are connected to Node1 by any relationship. I'm using this query:

start n=(1) match (n)--(x) where return x

x will be 3 because there are three relationships out of Node1. I only want x=2, because rel1 and rel2 are both connected to Node2. I'm not sure I'm explaining this well. Thanks a lot, I love this community.

-- View this message in context: http://neo4j-user-list.438527.n3.nabble.com/How-to-filter-out-a-visited-node-in-Cypher-tp3160310p3160310.html Sent from the Neo4J User List mailing list archive at Nabble.com.

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
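What distinct changes here can be illustrated outside of Cypher. The sketch below is plain Java over a hypothetical list of relationships given as pairs of node ids; the class and method names are made up, and the third relationship is assumed to lead to a separate node 3, which is what the expected count of 2 in the question implies. Each neighbour is collected into a set, so a node reached over several relationships is counted once.

```java
import java.util.HashSet;
import java.util.Set;

class DistinctNeighborsSketch {
    /** Counts the distinct nodes connected to `start`, ignoring relationship direction. */
    static int countDistinctNeighbors(long start, long[][] relationships) {
        Set<Long> seen = new HashSet<>();
        for (long[] rel : relationships) {
            if (rel[0] == start) {
                seen.add(rel[1]);
            } else if (rel[1] == start) {
                seen.add(rel[0]);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // rel1 and rel2 both connect node 1 to node 2; rel3 connects node 1 to node 3 (assumed).
        long[][] rels = { {1, 2}, {1, 2}, {1, 3} };
        System.out.println(countDistinctNeighbors(1, rels)); // prints 2
    }
}
```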
Re: [Neo4j] Setting up a Cluster and querying
Hi Christian, Please see http://docs.neo4j.org/chunked/1.4.M06/ha.html for info on Neo4j HA. You can run a coordinator and a Neo4j server on the same machines. That's a common setup. As for how to query it, answering that requires some more explanation about how Neo4j can be run. Neo4j can be used in two deployment modes: embedded in a Java process, or stand-alone server. The server however internally runs an embedded instance. See http://docs.neo4j.org/chunked/1.4.M06/deployment-scenarios.html for more information on this. In an HA environment, a stand-alone server would be accessed over HTTP via the REST API[1]. You can also write custom extensions[2] in order to deploy Java code on the server so that you can build your own domain-specific query API. If you're not using the stand-alone server, but instead using embedded Neo4j in e.g. a web application deployed on Tomcat, then the API you expose from your webapp is completely up to you. Internally it then uses an embedded Neo4j instance, where you have full access to the Java API. In addition to these options, you can also use our new query language, Cypher[3]. You can try it out from the web administration interface of the stand-alone server. When setting up a Neo4j HA cluster, you typically also configure a load balancer in front of the cluster. The load balancer can use any method it desires to distribute the requests to the machines in the cluster. The load balancer is however not included in the Neo4j distribution -- it is something the user needs to provide. You could look into the Apache HTTP Server or HAProxy. Hope that answers some of your questions. 
David [1] http://docs.neo4j.org/chunked/1.4.M06/rest-api.html [2] http://docs.neo4j.org/chunked/1.4.M06/server-plugins.html, http://docs.neo4j.org/chunked/1.4.M06/server-unmanaged-extensions.html [3] http://docs.neo4j.org/chunked/1.4.M06/cypher-query-lang.html On Wed, Jul 6, 2011 at 11:22 AM, Christian Godde christian.go...@googlemail.com wrote: Hi there, I am quite a newbie with neo4j and I hope somebody can help me. I want to set up a Cluster with 6 Servers and a few Coordinators (can a Server at the same time be a Coordinator?). Theoretically the setting up of this cluster is more or less clear to me. But the big question for me is: How do I query this cluster? So that I don't communicate with a single server all the time, but the server with the lowest load at this time. I hope you know what I mean. Regards, Christian ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
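Since the load balancer is left to the user, the simplest distribution policy, round-robin, can be sketched in a few lines. This is a hypothetical client-side illustration only: the class `RoundRobinSketch` and the host names are made up, it does not measure server load as the original question asks, and a real deployment would more likely rely on a dedicated balancer such as the ones mentioned in the reply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinSketch {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = new ArrayList<>(servers); // defensive copy
    }

    /** Returns the next server in rotation; safe to call from several threads. */
    String pick() {
        // floorMod keeps the index valid even after the counter overflows to negative values.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical base URLs of the cluster members.
        RoundRobinSketch balancer = new RoundRobinSketch(Arrays.asList(
                "http://neo4j-1.example:7474", "http://neo4j-2.example:7474", "http://neo4j-3.example:7474"));
        for (int i = 0; i < 4; i++) {
            System.out.println(balancer.pick());
        }
    }
}
```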
Re: [Neo4j] Performance issue on nodes with lots of relationships
Hi Andrew, How big is your configured Java heap? It could be that all the nodes and relationships don't fit into the cache. David

On Wed, Jul 6, 2011 at 8:03 PM, Andrew White li...@andrewewhite.net wrote:

Here are some interesting stats to consider. First, I split my nodes into two groups, one node with 1.4M children and the other with 3.4M children. While I do see some cache warm-up improvements, the traversal doesn't seem to scale linearly; i.e. the larger super-node has 2.4x more children but takes 17x longer to traverse.

neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 1468486  |
+----------+
1 rows, 25724 ms

neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 1468486  |
+----------+
1 rows, 19763 ms

neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 3472174  |
+----------+
1 rows, 565448 ms

neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 3472174  |
+----------+
1 rows, 337975 ms

Any ideas on this? Andrew

On 07/06/2011 09:55 AM, Peter Neubauer wrote:

Andrew, if you upgrade to 1.4.M06, your shell should be able to do Cypher in order to count the relationships of a node, not returning them:

start n=(1) match (n)-[r]-(x) return count(r)

or something along these lines, and try that several times to see if cold caches are initially slowing down things. In the LS and Neoclipse the output and visualization will be slow for that amount of data. Cheers, /peter neubauer

GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://startupbootcamp.org/ - Öresund - Innovation happens HERE. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Wed, Jul 6, 2011 at 4:15 PM, Andrew White li...@andrewewhite.net wrote:

I have a graph with roughly 10M nodes. Some of these nodes are highly connected to other nodes. For example, I may have a single node with 1M+ relationships. A good analogy is a population that has a lives-in relationship to a state.

Now the problem... Both neoclipse and neo4j-shell are terribly slow when working with these nodes. In the shell I would expect a `cd <node-id>` to be very fast, much like selecting via a rowid in a standard DB. Instead, I usually see a delay of several seconds. Doing an `ls` takes so long that I usually have to just kill the process. In fact, `ls` never outputs anything, which is odd since I would expect it to stream the output as it found it. I have very similar performance issues with neoclipse.

I am using Neo4j 1.3 embedded on Ubuntu 10.04 with 4GB of RAM. Disclaimer: I am new to Neo4j. Thanks, Andrew

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
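For an embedded setup like the one described, the heap that the JVM actually received can be checked from inside the process with the standard `Runtime` API. The class name `HeapInfoSketch` is hypothetical and nothing here is Neo4j-specific; the sketch only reports the JVM's own view, which is a quick way to confirm whether a heap setting took effect.

```java
class HeapInfoSketch {
    private static final long MB = 1024L * 1024L;

    /** Upper bound the JVM will try to use for the heap, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently in use (allocated minus free), in megabytes. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    public static void main(String[] args) {
        System.out.println("max heap:  " + maxHeapMb() + " MB");
        System.out.println("used heap: " + usedHeapMb() + " MB");
    }
}
```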
Re: [Neo4j] LuceneIndex IllegalArgumentException
to create almost 5000 new nodes in one transaction without any problem. We don't do lookups when creating a new node. It only breaks when updating. Is something wrong with the way we are searching, or is it a bug in LuceneIndex? Is there a workaround? Thank you. John

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Getting a few critical errors with Neo4J 1.3
Hey Rick, Got any stacktrace/cause to go with that? Thanks, David

On Fri, May 20, 2011 at 7:23 AM, Rick Bullotta rick.bullo...@thingworx.com wrote:

Under high load, with multiple threads writing and reading simultaneously, after a couple of hours, things start to melt down and we get:

org.neo4j.kernel.impl.persistence.ResourceAcquisitionFailedException: TM encountered an unexpected error condition.

Any thoughts? Rick

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Question from Webinar - traversing a path with nodes of different types
Hi Vipul, Out of curiosity, what does "process" mean in this context? As Rick alludes to, you'd have some component performing the simulation using the domain objects and possibly a graph traversal. An example of an algorithm for this would be to walk the graph from 1, and whenever you find a branch, you split the walk. When you finish walking a branch (at a point where more than one branch joins) you use some kind of synchronization to join the walks. Does this make sense? David

On Wed, Apr 20, 2011 at 11:16 PM, Vipul Gupta vipulgupta...@gmail.com wrote:

Hi David, Inputs are 1 and 6 + Graph is acyclic.

    domain.Client@1 - domain.Router@2 - domain.Router@3 - domain.Router@5 - domain.Server@6
                    - domain.Router@7 - domain.Router@8 -

I want a way to start from 1, process the 2 path till it reaches 5 (say in a thread), process the 7 path till it reaches 5 (in another thread), then process 5 and eventually 6. The above step of processing an intermediate path and waiting at the blocking point can happen over and over again in a more complex graph (that is, there could even be a number of loops in between), and the traversal stops only when we reach 6. I hope this makes it a bit clearer. I was working out something for this, but it is turning out to be too complex a solution for this sort of traversal of a graph, so I am hoping you can suggest something. Best Regards, Vipul

On Thu, Apr 21, 2011 at 11:36 AM, David Montag david.mon...@neotechnology.com wrote:

Hi Vipul, Zooming out a little bit, what are the inputs to your algorithm, and what do you want it to do? For example, given 1 and 6, do you want to find any points in the chain between them that are join points of two (or more) subchains (5 in this case)? David

On Wed, Apr 20, 2011 at 10:56 PM, Vipul Gupta vipulgupta...@gmail.com wrote:

my mistake - I meant 5 depends on both 3 and 8 and acts as a blocking point till 3 and 8 finish

On Thu, Apr 21, 2011 at 11:19 AM, Vipul Gupta vipulgupta...@gmail.com wrote:

David/Michael, Let me modify the example a bit. What if my graph structure is like this

    domain.Client@1 - domain.Router@2 - domain.Router@3 - domain.Router@5 - domain.Server@6
                    - domain.Router@7 - domain.Router@8 -

Imagine a manufacturing line. 6 depends on both 3 and 8 and acts as a blocking point till 3 and 8 finish. Is there a way to get a cleaner traversal for this kind of relationship? I want to get a complete intermediate traversal from Client to Server. Thanks a lot for helping out on this. Best Regards, Vipul

On Thu, Apr 21, 2011 at 12:09 AM, David Montag david.mon...@neotechnology.com wrote:

Hi Vipul, Thanks for listening! It's a very good question, and the short answer is: yes! I'm cc'ing our mailing list so that everyone can take part in the answer. Here's the long answer, illustrated by an example:

Let's assume you're modeling a network. You'll have some domain classes that are all networked entities with peers:

    @NodeEntity
    public class NetworkEntity {
        @RelatedTo(type = "PEER", direction = Direction.BOTH, elementClass = NetworkEntity.class)
        private Set<NetworkEntity> peers;

        public void addPeer(NetworkEntity peer) {
            peers.add(peer);
        }
    }

    public class Server extends NetworkEntity {}
    public class Router extends NetworkEntity {}
    public class Client extends NetworkEntity {}

Then we can build a small network:

    Client c = new Client().persist();
    Router r1 = new Router().persist();
    Router r21 = new Router().persist();
    Router r22 = new Router().persist();
    Router r3 = new Router().persist();
    Server s = new Server().persist();
    c.addPeer(r1);
    r1.addPeer(r21);
    r1.addPeer(r22);
    r21.addPeer(r3);
    r22.addPeer(r3);
    r3.addPeer(s);
    c.persist();

Note that after linking the entities, I only call persist() on the client. You can read more about this in the reference documentation, but essentially it will cascade in the direction of the relationships created, and will in this case cascade all the way to the server entity. You can now query this:

    Iterable<EntityPath<Client, Server>> paths = c.findAllPathsByTraversal(Traversal.description());

The above code will get you an EntityPath per node visited during the traversal from c. The example does not use a very interesting traversal description, but you can still print the results:

    for (EntityPath<Client, Server> path : paths) {
        StringBuilder sb = new StringBuilder();
        Iterator<NetworkEntity> iter = path.<NetworkEntity>nodeEntities().iterator();
        while (iter.hasNext()) {
            sb.append(iter.next());
            if (iter.hasNext()) sb.append(" - ");
        }
        System.out.println(sb);
    }

This will print each path, with all entities in the path. This is what it looks like:

    domain.Client@1
    domain.Client@1 - domain.Router@2
    domain.Client@1 - domain.Router@2 - domain.Router@3
    domain.Client@1
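The split-and-join walk suggested at the top of this thread can be sketched with plain Java futures. This is not Spring Data Graph or Neo4j code: the class `ForkJoinWalkSketch`, the use of bare integer ids in place of entities, and the predecessor map are all assumptions for illustration, and the sketch relies on the graph being acyclic, as stated in the question. Each node gets a future that starts only after the futures of all its predecessors have completed, so independent branches run in parallel and a join point such as 5 waits for both 3 and 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

class ForkJoinWalkSketch {
    /**
     * predecessors.get(n) lists the nodes that must finish before n is processed.
     * Returns the order in which nodes were actually processed.
     */
    static List<Integer> process(Map<Integer, List<Integer>> predecessors) {
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        Map<Integer, CompletableFuture<Void>> futures = new HashMap<>();
        for (Integer node : predecessors.keySet()) {
            schedule(node, predecessors, futures, order);
        }
        // Wait until every scheduled node has been processed.
        CompletableFuture.allOf(futures.values().toArray(new CompletableFuture[0])).join();
        return order;
    }

    private static CompletableFuture<Void> schedule(Integer node,
                                                    Map<Integer, List<Integer>> predecessors,
                                                    Map<Integer, CompletableFuture<Void>> futures,
                                                    List<Integer> order) {
        CompletableFuture<Void> existing = futures.get(node);
        if (existing != null) {
            return existing;
        }
        List<CompletableFuture<Void>> deps = new ArrayList<>();
        for (Integer p : predecessors.getOrDefault(node, Collections.<Integer>emptyList())) {
            deps.add(schedule(p, predecessors, futures, order));
        }
        // The join: this node runs only after all of its predecessors are done.
        CompletableFuture<Void> future = CompletableFuture
                .allOf(deps.toArray(new CompletableFuture[0]))
                .thenRunAsync(() -> order.add(node)); // "processing" is just recording the id here
        futures.put(node, future);
        return future;
    }

    public static void main(String[] args) {
        // The graph from the question: 1 -> 2 -> 3 -> 5 -> 6 and 1 -> 7 -> 8 -> 5.
        Map<Integer, List<Integer>> predecessors = new HashMap<>();
        predecessors.put(2, Arrays.asList(1));
        predecessors.put(3, Arrays.asList(2));
        predecessors.put(7, Arrays.asList(1));
        predecessors.put(8, Arrays.asList(7));
        predecessors.put(5, Arrays.asList(3, 8));
        predecessors.put(6, Arrays.asList(5));
        // The two branches may interleave, but 1 always comes first and 5 and 6 come last.
        System.out.println(process(predecessors));
    }
}
```

In a real application the recorded id would be replaced by whatever processing the domain entity needs, and the predecessor map would be built from a graph traversal rather than written by hand.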
Re: [Neo4j] Question from Webinar - traversing a path with nodes of different types
Hi Vipul, Zooming out a little bit, what are the inputs to your algorithm, and what do you want it to do? For example, given 1 and 6, do you want to find any points in the chain between them that are join points of two (or more) subchains (5 in this case)? David

On Wed, Apr 20, 2011 at 10:56 PM, Vipul Gupta vipulgupta...@gmail.com wrote:

my mistake - I meant 5 depends on both 3 and 8 and acts as a blocking point till 3 and 8 finish

On Thu, Apr 21, 2011 at 11:19 AM, Vipul Gupta vipulgupta...@gmail.com wrote:

David/Michael, Let me modify the example a bit. What if my graph structure is like this

    domain.Client@1 - domain.Router@2 - domain.Router@3 - domain.Router@5 - domain.Server@6
                    - domain.Router@7 - domain.Router@8 -

Imagine a manufacturing line. 6 depends on both 3 and 8 and acts as a blocking point till 3 and 8 finish. Is there a way to get a cleaner traversal for this kind of relationship? I want to get a complete intermediate traversal from Client to Server. Thanks a lot for helping out on this. Best Regards, Vipul

On Thu, Apr 21, 2011 at 12:09 AM, David Montag david.mon...@neotechnology.com wrote:

Hi Vipul, Thanks for listening! It's a very good question, and the short answer is: yes! I'm cc'ing our mailing list so that everyone can take part in the answer. Here's the long answer, illustrated by an example:

Let's assume you're modeling a network. You'll have some domain classes that are all networked entities with peers:

    @NodeEntity
    public class NetworkEntity {
        @RelatedTo(type = "PEER", direction = Direction.BOTH, elementClass = NetworkEntity.class)
        private Set<NetworkEntity> peers;

        public void addPeer(NetworkEntity peer) {
            peers.add(peer);
        }
    }

    public class Server extends NetworkEntity {}
    public class Router extends NetworkEntity {}
    public class Client extends NetworkEntity {}

Then we can build a small network:

    Client c = new Client().persist();
    Router r1 = new Router().persist();
    Router r21 = new Router().persist();
    Router r22 = new Router().persist();
    Router r3 = new Router().persist();
    Server s = new Server().persist();
    c.addPeer(r1);
    r1.addPeer(r21);
    r1.addPeer(r22);
    r21.addPeer(r3);
    r22.addPeer(r3);
    r3.addPeer(s);
    c.persist();

Note that after linking the entities, I only call persist() on the client. You can read more about this in the reference documentation, but essentially it will cascade in the direction of the relationships created, and will in this case cascade all the way to the server entity. You can now query this:

    Iterable<EntityPath<Client, Server>> paths = c.findAllPathsByTraversal(Traversal.description());

The above code will get you an EntityPath per node visited during the traversal from c. The example does not use a very interesting traversal description, but you can still print the results:

    for (EntityPath<Client, Server> path : paths) {
        StringBuilder sb = new StringBuilder();
        Iterator<NetworkEntity> iter = path.<NetworkEntity>nodeEntities().iterator();
        while (iter.hasNext()) {
            sb.append(iter.next());
            if (iter.hasNext()) sb.append(" - ");
        }
        System.out.println(sb);
    }

This will print each path, with all entities in the path. This is what it looks like:

    domain.Client@1
    domain.Client@1 - domain.Router@2
    domain.Client@1 - domain.Router@2 - domain.Router@3
    domain.Client@1 - domain.Router@2 - domain.Router@3 - domain.Router@5
    domain.Client@1 - domain.Router@2 - domain.Router@3 - domain.Router@5 - domain.Server@6
    domain.Client@1 - domain.Router@2 - domain.Router@3 - domain.Router@5 - domain.Router@4

Let us know if this is what you were looking for. If you want to only find paths that end with a server, you'd use this query instead:

    Iterable<EntityPath<Client, Server>> paths = c.findAllPathsByTraversal(
            Traversal.description().evaluator(new Evaluator() {
                @Override
                public Evaluation evaluate(Path path) {
                    if (new ConvertingEntityPath(graphDatabaseContext, path).endEntity() instanceof Server) {
                        return Evaluation.INCLUDE_AND_PRUNE;
                    }
                    return Evaluation.EXCLUDE_AND_CONTINUE;
                }
            }));

In the above code example, graphDatabaseContext is a bean of type GraphDatabaseContext created by Spring Data Graph. This syntax will dramatically improve in future releases. It will print:

    domain.Client@1 - domain.Router@2 - domain.Router@3 - domain.Router@5 - domain.Server@6

Regarding your second question about types: If you want to convert a node into an entity, you would use the TypeRepresentationStrategy configured internally in Spring Data Graph. See the reference documentation for more information on this. If you want to convert Neo4j paths to entity paths, you can use the ConvertingEntityPath class as seen above. As an implementation detail, the class name is stored on the node as a property. Hope this helped
Re: [Neo4j] Basic Node storage/retrieval related question?
Hi Karan, Are you using Spring Data Graph, or the native Neo4j API? David

On Thu, Apr 21, 2011 at 10:21 AM, G vlin...@gmail.com wrote:

I have a POJO with a field a, which I initialize like this:

    Object a = 10;

I store the POJO containing this field using Neo4j. When I load this POJO, I have a getter method to get the object:

    Object getA() { return a; }

*What should be the class type of a?* I am of the opinion it should be java.lang.Integer, but it is coming out as java.lang.String. I am assuming this is because of node.getProperty(...). Is there a way I can get an Integer object only? Also, which types can be stored? thanks, Karan

-- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo4j] Palo Alto release party!
Hi graphsters, If you are in the SF bay area, don't think you'll go unnoticed! We're expecting to see you at the Palo Alto release party next Monday (4/18). Beers are on us! I'll be the guy with the tan Neo4j shirt. The venue is Nola (www.nolas.com) at 535 Ramona St. Hoping to see you there! David -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Neo Server slow on frequent requests
Hi Dario, I just had a peek at the thread dump, and it appear that it was not captured during the frozen period. Is that correct? If captured when the system appears frozen, it will show information about what the threads are doing or waiting for. Thanks, David On Wed, Apr 13, 2011 at 2:50 AM, Dario Rexin dario.re...@xing.com wrote: Hi Tobias, I was already sending requests to the server in the last dump. Here is another, hopefully this one is more helpful. The longer I request data, the longer it takes for the server to answer. After some time it frequently freezes for up to several seconds without answering to any of the requests. https://gist.github.com/917283 Cheers, Dario Am 13.04.11 11:14 schrieb Tobias Ivarsson unter tobias.ivars...@neotechnology.com: Hi Dario, This dump looks perfectly fine, the expected threads are there, but they are all idle waiting for work. When I asked for a thread dump, I wanted one from when the server was under load and you experienced problems. Sorry for not being clear about that. Cheers, Tobias On Wed, Apr 13, 2011 at 10:34 AM, Dario Rexin dario.re...@xing.com wrote: Hey, Somehow my attached files always get deleted. Heres the dump: https://gist.github.com/917199 Cheers, Dario Am 13.04.11 10:30 schrieb Dario Rexin unter dario.re...@xing.com: Hi Tobias, Here's the thread dump you asked for. Thank you for taking a look at this. Cheers, Dario Am 12.04.11 22:16 schrieb Tobias Ivarsson unter tobias.ivars...@neotechnology.com: Hi Dario, Looking at that picture it is indeed clear that a number of threads are waiting for something. What is not shown is the more important information about *what* they are waiting for. I would love to get information like that in order to investigate the cause of the performance problem you are seeing. If you could send a thread dump instead of a screenshot that would be a lot more useful, since that would contain information about contention that I could actually analyze. 
The easiest way to get a thread dump is by sending the SIGQUIT signal (kill -3) to the JVM process running Neo4j. Cheers, Tobias On Tue, Apr 12, 2011 at 6:35 PM, Dario Rexin dario.re...@xing.com wrote: Hi all, Due to huge performance issues with some of our neo queries I profiled my calls on the neo server. The profiling shows that up to 85% of the time the threads are waiting for other threads. I don't understand what's going on there. Hopefully someone with deeper knowledge can help me. Am I doing something wrong, or is it normal that most of the time the threads are blocking each other? Here is a screenshot showing the results of my profiling: http://i.imgur.com/eIfam.jpg Thanks in advance, Dario ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Dario Rexin Junior Manager Engineering dario.re...@xing.com XING AG Gaensemarkt 43, 20354 Hamburg, Germany Commercial Reg. (Registergericht): Amtsgericht Hamburg, HRB 98807 Exec. Board (Vorstand): Dr. Stefan Groß-Selbeck (Vorsitzender), Ingo Chu, Dr. Helmut Becker, Jens Pape Chairman of the Supervisory Board (Aufsichtsratsvorsitzender): Dr. Neil Sunderland Please join my network on XING: https://www.xing.com/profile/Dario_Rexin This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorised copying, disclosure or distribution of the material in this e-mail is strictly forbidden and may be unlawful. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology
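Tobias's kill -3 suggestion produces a thread dump from outside the JVM. As a minimal sketch, assuming the code can run inside the same JVM as Neo4j (for example in an embedded setup), the standard java.lang.management API can collect similar information, including the lock each thread is waiting on. The class name ThreadDumpSketch is hypothetical and not part of Neo4j.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Neo4j: collects a textual thread dump in-process.
final class ThreadDumpSketch {

    /** Returns one entry per live thread: name, state, contended lock (if any), stack. */
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only ask for monitor/synchronizer details when this JVM supports them.
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The object this thread is blocked or waiting on.
                out.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                // The thread currently holding that lock, i.e. the source of contention.
                out.append(" held by \"").append(info.getLockOwnerName()).append('"');
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

As with kill -3, the dump is only useful for diagnosing contention if it is taken while the server is under load or appears frozen.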
Re: [Neo4j] EmbeddedReadOnlyGraphDatabase workings
Alfredas, When you say watch, do you mean poll the graph at some interval? What would the read-only client do? Thanks, David On Tue, Apr 5, 2011 at 5:26 AM, Alfredas Chmieliauskas al.fre...@gmail.com wrote: Dear all, we have the following situation: - 1 client is writing to the embedded db (writer) - 1 client would like to watch that (read-only) Is that possible with the EmbeddedReadOnlyGraphDatabase? Currently it seems that the read-only db does not see the updates made by the writer since its creation. Is there a way to force a refresh besides creating a new instance? Also, are there any other/better ways to do that (1 writer, 1 reader) without going into server mode? Thanks a lot, Alfredas ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] EmbeddedReadOnlyGraphDatabase workings
Alfredas, A solution based on EmbeddedReadOnlyGraphDatabase would not be able to benefit from caching, as each refresh of the view would have to clear the caches. You could possibly achieve the solution you want by setting the config parameter cache_type=none for the read-only instance. Then it should not cache anything and always read from disk or OS cache. This would however yield degraded performance if you repeatedly read the same data. If your processing could benefit from caching, then you're better off creating a new instance, or manually clearing the caches. Or going with the HA-based solution that Jim and Mattias outlined. David On Tue, Apr 5, 2011 at 9:36 AM, Alfredas Chmieliauskas al.fre...@gmail.com wrote: Yes. The read-only client would query the db at time intervals. Alfredas On Tue, Apr 5, 2011 at 5:40 PM, David Montag david.mon...@neotechnology.com wrote: Alfredas, When you say watch, do you mean poll the graph at some interval? What would the read-only client do? Thanks, David On Tue, Apr 5, 2011 at 5:26 AM, Alfredas Chmieliauskas al.fre...@gmail.com wrote: Dear all, we have the following situation: - 1 client is writing to the embedded db (writer) - 1 client would like to watch that (read-only) Is that possible with the EmbeddedReadOnlyGraphDatabase? Currently it seems that the read-only db does not see the updates made by the writer since its creation. Is there a way to force a refresh besides creating a new instance? Also, are there any other/better ways to do that (1 writer, 1 reader) without going into server mode? 
Thanks a lot, Alfredas ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Adding Nodes, transactions and such...
Hi, Is this of help? http://wiki.neo4j.org/content/Transactions#Big_transactions David On Thu, Mar 24, 2011 at 12:11 PM, jisenhart jisenh...@yoholla.com wrote: Hi All, When I load data via an EmbeddedGraphDatabase - I need to stop the server, load the nodes and then restart the server. Is there a way to load the server while it is active? Are there ways to do intermittent transaction commits? I was thinking to call transaction.finish and then get a new transaction after x number of records processed? ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
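The idea raised in the question, committing after every x records and then starting a fresh transaction, can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Neo4j code: Tx, TxSource and RecordWriter are hypothetical stand-ins. In the Neo4j 1.x embedded API the begin step would correspond to graphDb.beginTx() and the commit step to tx.success() followed by tx.finish(). Rollback on failure is deliberately left out to keep the batching logic visible.

```java
import java.util.Arrays;

// Hypothetical sketch of periodic commits; the interfaces below are stand-ins, not Neo4j types.
final class BatchedLoadSketch {

    /** Stand-in for a transaction. With Neo4j 1.x, commit() would be tx.success(); tx.finish(); */
    interface Tx { void commit(); }

    /** Stand-in for whatever opens a transaction. With Neo4j 1.x this would be graphDb.beginTx(). */
    interface TxSource { Tx begin(); }

    /** Stand-in for the code that turns one input record into nodes/relationships. */
    interface RecordWriter<T> { void write(T record); }

    /** Writes all records, committing every batchSize records. Returns the number of commits. */
    static <T> int load(Iterable<T> records, int batchSize, TxSource source, RecordWriter<T> writer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        int inBatch = 0;
        Tx tx = null;
        for (T record : records) {
            if (tx == null) {
                tx = source.begin();      // open a transaction lazily, only when there is work
            }
            writer.write(record);
            if (++inBatch == batchSize) { // batch is full: commit and start over
                tx.commit();
                commits++;
                tx = null;
                inBatch = 0;
            }
        }
        if (tx != null) {                 // commit the final, partially filled batch
            tx.commit();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        int commits = load(Arrays.asList("a", "b", "c", "d", "e"), 2,
                () -> () -> System.out.println("commit"),
                record -> System.out.println("write " + record));
        System.out.println("commits: " + commits);
    }
}
```

Keeping each transaction small bounds the amount of uncommitted state held in memory, which is the concern the linked "Big transactions" wiki section addresses.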
Re: [Neo4j] IndexService - where is the util jar file??
Jeff, IndexService is part of an older API. Please see http://docs.neo4j.org/chunked/snapshot/indexing.html for the latest indexing API docs. David On Wed, Mar 23, 2011 at 3:40 PM, jisenhart jisenh...@yoholla.com wrote: Where are IndexService and LuceneIndexService located? I cannot find them in any of the neo4j-1.3.M04/lib jars. Thanks, Jeff ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag david.mon...@neotechnology.com Neo Technology, www.neotechnology.com Cell: 650.556.4411 Skype: ddmontag ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Where is the beer?
At 11:50, Jim Webber wrote: Hey Rick, It was a pleasure to meet you too. And this got me thinking - it would be great to meet more folks from this list, or to form user groups, or generally just get a beer and talk Neo4j graphs. Is there, for example, a strong London contingent on this list? I only know me and Nat Pryce so far. Anyone else care to get together in London? Jim ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Graph design
Massimo, So just to understand your graph layout, you have: (UID) --PERFORMED_ACTION_ON-- (DOMAIN) (UID) --ACTION_TOOK_PLACE_FROM-- (IP) Is this correct? Could you elaborate a bit more on the use case, along with the queries you want to do on your data? Thanks, David On Wed, Mar 16, 2011 at 10:00 AM, Massimo Lusetti mluse...@gmail.com wrote: I remember having read about some design smells but I cannot find it in the Design_Guide wiki, so I post it here. I have IP addresses and uids (unique usernames); each uid performs actions on domains (kind of like URLs). So I have a db with a small to medium number of Nodes for uid, IP and domains (with 1/2 properties each), but I have a lot of Relationships, because I create a Relationship between a domain Node and a uid Node (with properties of course) to represent an action taken by the user on that particular domain. The same applies to Relationships between IP and uid, because they represent that the action took place from that particular IP address. So I'll end up with far more Relationships than Nodes: let's say that for 26139 Nodes I have 6866630 Relationships, and the number of Nodes will continue to grow with a far, far flatter curve than Relationships. Do you think there's some design smell in my graph!? Thanks -- Massimo http://meridio.blogspot.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Graph design
Massimo, It sounds like certain PERFORMED_ACTION_ON and ACTION_TOOK_PLACE_FROM relationships are logically grouped/related. Is this a correct statement? If so, then you might want to consider something like: (UID) --TOOK_ACTION-- (ACTION) (ACTION) --TOOK_PLACE_FROM-- (IP) (ACTION) --WAS_PERFORMED_ON-- (DOMAIN) Could such a model possibly make reasoning about and querying the data easier? David On Wed, Mar 16, 2011 at 10:26 AM, Massimo Lusetti mluse...@gmail.com wrote: On Wed, Mar 16, 2011 at 6:17 PM, David Montag david.mon...@neotechnology.com wrote: Massimo, So just to understand your graph layout, you have: (UID) --PERFORMED_ACTION_ON-- (DOMAIN) (UID) --ACTION_TOOK_PLACE_FROM-- (IP) Is this correct? Could you elaborate a bit more on the use case, along with the queries you want to do on your data? Thanks, David Yep, I have: (NETWORK) -- (IP) -- (UID) -- (DOMAIN) Then I need to collect which actions (defined as properties of the rel between UID and DOMAIN) users have taken from which NETWORK and calculate statistics (which domains are most used, the IPs most frequently used, the active UIDs in a period of time... and so on) and do some (for now limited) semantic analysis. Does this sound good? Am I using neo4j the right way? Cheers -- Massimo http://meridio.blogspot.com -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
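To make the proposed ACTION-node model concrete, here is a minimal in-memory sketch in plain Java. It is not Neo4j code: the Action class is a hypothetical stand-in for an ACTION node with its three relationships flattened into fields, and countBy is an illustrative helper. It shows how the statistics mentioned in the thread (most used domains, most frequently used IPs, active UIDs) all become a grouping over ACTION nodes by one of their neighbours.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical in-memory stand-in for the ACTION-node model; not Neo4j code.
final class ActionModelSketch {

    /** One ACTION node, with its three relationships flattened into fields. */
    static final class Action {
        final String uid;    // (UID) --TOOK_ACTION-- (ACTION)
        final String ip;     // (ACTION) --TOOK_PLACE_FROM-- (IP)
        final String domain; // (ACTION) --WAS_PERFORMED_ON-- (DOMAIN)

        Action(String uid, String ip, String domain) {
            this.uid = uid;
            this.ip = ip;
            this.domain = domain;
        }
    }

    /** Counts actions grouped by one neighbour of the ACTION node (uid, ip or domain). */
    static Map<String, Integer> countBy(List<Action> actions, Function<Action, String> key) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Action action : actions) {
            counts.merge(key.apply(action), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Action> log = Arrays.asList(
                new Action("alice", "10.0.0.1", "example.org"),
                new Action("alice", "10.0.0.2", "example.net"),
                new Action("bob", "10.0.0.1", "example.org"));
        System.out.println("per domain: " + countBy(log, a -> a.domain));
        System.out.println("per ip:     " + countBy(log, a -> a.ip));
        System.out.println("per uid:    " + countBy(log, a -> a.uid));
    }
}
```

The trade-off, under this reading of the proposal, is one extra node per logged action in exchange for keeping the user, IP and domain of a single action tied together.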
Re: [Neo4j] Graph design
Massimo, If you'd like, I could skype with you later this afternoon (in 4-5 hours) and discuss it? David On Wed, Mar 16, 2011 at 11:01 AM, Massimo Lusetti mluse...@gmail.com wrote: On Wed, Mar 16, 2011 at 6:41 PM, David Montag david.mon...@neotechnology.com wrote: Massimo, It sounds like certain PERFORMED_ACTION_ON and ACTION_TOOK_PLACE_FROM relationships are logically grouped/related. Is this a correct statement? If so, then you might want to consider something like: (UID) --TOOK_ACTION-- (ACTION) (ACTION) --TOOK_PLACE_FROM-- (IP) (ACTION) --WAS_PERFORMED_ON-- (DOMAIN) Could such a model possibly make reasoning about and querying the data easier? David I'm not sure if I get it all and correctly, but... That would enormously complicate the logic that parses the logs and produces data ... But indeed it sounds to my ear like a pretty nice suggestion. I'm going to give it a try and see if that's feasible. Thanks a lot, will let you know -- Massimo http://meridio.blogspot.com -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Issue with lucene index
) at org.structr.core.node.TransactionCommand.execute(TransactionCommand.java:37) at org.structr.core.entity.AbstractNode.commit(AbstractNode.java:968) at org.structr.core.log.LogService.run(LogService.java:85) Caused by: javax.transaction.HeuristicMixedException: Unable to rollback --- error code in commit: -1 --- error code for rollback: 0 at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:664) at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:586) at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:105) at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:86) ... 3 more ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] doInternalRecovery takes a long time
Samuel, Can you try it with 1.3.M03, if possible? Thanks, David On Wed, Mar 9, 2011 at 6:13 PM, Samuel Feng okos...@gmail.com wrote: Dear list, I am developing a tomcat application in eclipse which has about 100,000 nodes using EmbeddedGraphDatabase (Neo4j version 1.2M05). Sometimes it takes a long time (more than half an hour) to doInternalRecovery, maybe because I shut down the tomcat server directly. In messages.log I can find many, many "Injected two phase commit" entries. *Can you tell me what causes this and how to prevent it?* Thu Mar 10 09:47:47 CST 2011: Opened [C:\home\heartwater\graph\\nioneo_logical.log.1] clean empty log, version=1 Thu Mar 10 09:47:47 CST 2011: Opened [C:\home\heartwater\graph\/lucene/lucene.log.1] clean empty log, version=0 Thu Mar 10 09:47:47 CST 2011: Opened [C:\home\heartwater\graph\/lucene-fulltext/lucene.log.1] clean empty log, version=0 Thu Mar 10 09:47:47 CST 2011: Non clean shutdown detected on log [C:\home\heartwater\graph\index/lucene.log.1]. Recovery started ... 
Thu Mar 10 09:47:47 CST 2011: [C:\home\heartwater\graph\index/lucene.log.1] logVersion=0 with committed tx=1 Thu Mar 10 09:47:48 CST 2011: Injected two phase commit, txId=2 Thu Mar 10 09:47:48 CST 2011: Injected two phase commit, txId=3 Thu Mar 10 09:47:48 CST 2011: Injected two phase commit, txId=4 Thu Mar 10 09:47:49 CST 2011: Injected two phase commit, txId=5 Thu Mar 10 09:47:49 CST 2011: Injected two phase commit, txId=6 Thu Mar 10 09:47:49 CST 2011: Injected two phase commit, txId=7 Thu Mar 10 09:47:49 CST 2011: Injected two phase commit, txId=8 Thu Mar 10 09:47:49 CST 2011: Injected two phase commit, txId=9 Thu Mar 10 09:47:49 CST 2011: Injected two phase commit, txId=10 Thu Mar 10 09:47:49 CST 2011: Injected two phase commit, txId=11 Thu Mar 10 09:47:49 CST 2011: Injected two phase commit, txId=12 Thu Mar 10 09:47:50 CST 2011: Injected two phase commit, txId=13 Thu Mar 10 09:47:50 CST 2011: Injected two phase commit, txId=14 ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Performance expectations for Neo4j.
Bård, Great to hear you're evaluating us for your solution. I have a couple of questions. First, how much RAM do you have in the machine, and how much heap are you allocating for the Java process? Peter's question about running it multiple times is also very relevant. Secondly, I'd like to understand your use case a bit more. You have an ACL style graph. Your query is then, given a user, give me all customer resources that the user has some kind of access to. How is this result set then used? I'd appreciate it if you could outline your use cases a bit more. For example, maybe you're trying to check whether a user has access to a certain resource. Retrieving all resources for that user would not be the most efficient way of doing that. If you could shed some more light on this, then maybe we can find a better way forward. Thanks, David On Mon, Mar 7, 2011 at 6:46 AM, Bård Lind bard.l...@gmail.com wrote: Hi! First, really good work you are doing on the Neo4J project! Really appreciate the good level of documentation as well! Currently I'm running a POC for validating if we can use Graph and Neo4J when we want to filter/ACL user-access to resources. Our concept is much like your example here http://blog.neo4j.org/2010/02/access-control-lists-graph-database-way.html . We have Users and Profiles in one graph (left hand side), and then a hierarchical Customer graph on the right side. Access is controlled from a Profile to a given Customer Resource with Read, Write and/or Inherit. All is well functionality-wise, though I have had slower responses than I expected. Scenario: 1. The database is loaded with 175 000 nodes, and approx. the same number of relations. 2. From a single user, fetch all resources this user has access to. user - profile - security - graph of customers, at 1-6 levels. When the graph of customers has few nodes, approx. 200, the response is 30 ms, which is good. When the graph returns 170 000 (of 175 000) nodes, the response is slow: 7 seconds! 
This test was run on a laptop with SSD disk. Other things I have tried: - Running on ultra-fast Ubuntu: took 3.5 sec. Still way slower than I hoped for. - Setting up to prefer traversal speed from these pages: http://wiki.neo4j.org/content/Performance_Guide and http://wiki.neo4j.org/content/Configuration_Settings - Simpler Traverser and/or TraversalDescription filters. Still pretty slow. Facts: - Neo4j 1.3-M03 - Java 1.6.0_24 - Running the Embedded Neo4j server. - Currently 175 000 nodes, full implementation will have approx. 5 million resources, largest sub-graph will have some 500 000 nodes. The real question is whether 3.5 sec for fetching 170k nodes is expected, or is it possible to tune retrieval to less than a second? Thanks in advance. Bård (or Bard for non Scandinavian letters :-) ) ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
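David's point that checking access to one resource does not require retrieving every accessible resource can be illustrated with a small plain-Java sketch. It is not Neo4j code and rests on assumptions: the customer hierarchy is modelled as a hypothetical child-to-parent map, and inheritedGrants is a hypothetical set of resources on which the user's profiles hold an inheriting grant. The check walks upward from the resource, so its cost follows the depth of the hierarchy (1-6 levels in the scenario above) rather than the size of the sub-graph.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: point check of access by walking up the hierarchy; not Neo4j code.
final class AccessCheckSketch {

    /**
     * Returns true if the resource itself, or any of its ancestors, is in inheritedGrants.
     * parentOf maps each resource to its parent; roots have no entry.
     */
    static boolean hasAccess(String resource, Set<String> inheritedGrants, Map<String, String> parentOf) {
        Set<String> visited = new HashSet<>(); // guards against accidental cycles in the data
        String current = resource;
        while (current != null && visited.add(current)) {
            if (inheritedGrants.contains(current)) {
                return true;
            }
            current = parentOf.get(current);   // null once we step past a root
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("region-north", "customer-1");
        parentOf.put("site-42", "region-north");
        parentOf.put("region-south", "customer-1");

        Set<String> grants = new HashSet<>();
        grants.add("region-north");

        System.out.println(hasAccess("site-42", grants, parentOf));
        System.out.println(hasAccess("region-south", grants, parentOf));
    }
}
```

In a graph database the same idea would be a short upward traversal from the resource node; whether that matches the actual use case depends on the answers to David's questions.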
Re: [Neo4j] limiting results
Charlie, As Michael duly pointed out, the traverser is a lazy iterator, so you can simply pull the first N results from it. In order to get a random node you'd need to know the full result set. Alternatively you can make your traverser go in random directions and that way get a random result set. David On Mon, Mar 7, 2011 at 2:12 PM, charlie char...@avvo.com wrote: Is there a way to limit the number of results that are returned from a traverse? I have a traversal that returns thousands of nodes. Ideally I would like to get a random set of those nodes; failing that, I would be happy with the first N nodes. Charlie White Avvo, Inc. 1218 Third Avenue, Suite 300 Seattle, WA 98101 ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
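Both options from the reply can be sketched against a plain java.util.Iterator, which is an assumption standing in for the lazy traverser. firstN stops pulling as soon as it has N results. randomSample uses reservoir sampling, a technique not mentioned in the thread: it still has to walk the full result set, as David notes a random pick must, but it only ever holds N items in memory. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helpers over a plain Iterator standing in for a lazy traverser.
final class LimitSketch {

    /** Pulls at most n items; never advances the iterator further than needed. */
    static <T> List<T> firstN(Iterator<T> results, int n) {
        List<T> out = new ArrayList<>();
        // Check the size first so hasNext() is not called once the limit is reached.
        while (out.size() < n && results.hasNext()) {
            out.add(results.next());
        }
        return out;
    }

    /** Uniform random sample of up to n items via reservoir sampling (consumes the iterator). */
    static <T> List<T> randomSample(Iterator<T> results, int n, Random random) {
        List<T> reservoir = new ArrayList<>();
        int seen = 0;
        while (results.hasNext()) {
            T item = results.next();
            seen++;
            if (reservoir.size() < n) {
                reservoir.add(item);          // fill the reservoir with the first n items
            } else {
                int slot = random.nextInt(seen); // uniform in [0, seen)
                if (slot < n) {
                    reservoir.set(slot, item);   // replace with probability n / seen
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        Iterator<Integer> lazy = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).iterator();
        System.out.println(firstN(lazy, 3)); // prints [1, 2, 3]
        System.out.println(randomSample(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).iterator(), 3, new Random()));
    }
}
```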
Re: [Neo4j] Spring Neo4jTemplate
Cedric, Thank you for this additional input. We have, based on this, scheduled two new features: https://jira.springsource.org/browse/DATAGRAPH-52 https://jira.springsource.org/browse/DATAGRAPH-53 Once DATAGRAPH-52 is implemented, you should be able to use @RelatedTo for this instead. I would appreciate it if you would read the tickets and give us feedback on whether you think you would use this feature or not. Thanks, David On Wed, Feb 23, 2011 at 10:09 PM, cedric.hurst ced...@spantree.net wrote: Hi David and Michael, Firstly, I knew that field argument had to be there for a reason, I just couldn't figure it out from the examples. ;-) I think your explanation is exactly what I was looking for, Michael. I'll give it a shot tomorrow. And David, you are correct, I'm looking to create two queries, one to retrieve all B's related to A and another to retrieve all C's related to A. I somewhat obfuscated my use case for the sake of simplicity, but I'm actually investigating the use of a graph database to more flexibly drive conflict resolution in a rule engine, Drools in my case. In this model, each rule would have a representation as a node on the graph, and the conditions would relate to specific fact nodes or patterns. So let's say we have something like this: Rule[name: 'red car costs one hundred dollars', price: 100] -- REQUIRES -- CarColor[name: red] Rule[name: 'sedan costs five hundred dollars', price: 500] -- REQUIRES -- CarTrim[name: sedan] These nodes have a parity in our rule engine so that when we evaluate a red car, its price gets set to 100; and when we evaluate a sedan its price gets set to 500. However, if I try to evaluate a red sedan, the price is ambiguous. Typically, rule engines provide a notion of salience, which would allow me to assign numeric priority to each rule. Whichever rule has the highest priority wins, and the loser gets evicted from the agenda. 
The problem with this is that eventually you end up writing QBASIC-like rules where one rule has a salience of 10 and another has a salience of 20. That way, if we have a rule that sits in between them, we can give it a salience of 15. This only goes so far, though, so I'm looking to do something a bit better. By specifying a graph hierarchy, I expect to be able to use node traversals to do two things: 1. Discover conflicts by detecting any other rule nodes that do not have a requires constraint pointing to a node of the same type. For example, a rule for a red car and a rule for a green car are not in conflict; but if I define a rule for a red car and a rule for a sedan that doesn't also require some non-red color, I need to specify which rule takes priority should I hit an intersection of conditions. 2. Define mechanisms for rules to override or extend other rules. For example, if I wanted the sedan rule to take priority, I'd specify: Rule[name: 'sedan...'] -- OVERRIDES -- Rule[name: 'red car...'] Then, in the rules engine, I'd plug in a conflict resolution strategy that takes in an agenda of rules to fire and uses the graph to evict the rules that were overridden by other rules on the agenda. In the case of extension, I could have something like: Rule[name: 'red car...', price: 100] -- EXTENDS -- Rule[name: 'red sedan', price: 600] -- REQUIRES -- CarTrim[name: 'sedan'] This effectively defines a sub-rule for the intersection of a red car and a sedan, but the constraints would need to be collected through a node traversal, continue+excluding extends relationships and include+pruning from the requires relationships (CarTrim and CarColors). In order to accurately compile a rule from this model, though, I'd need to get a list of trims and color requirements separately (as they are distinct attributes of a Car fact). This is where the aforementioned A, B, C scenario comes into play. Of course, I could define separate relationship types for each attribute, e.g. 
REQUIRES_TRIM, REQUIRES_COLOR, but that just seemed like a lot of extra work. The nature of the relationship is the same, and ideally I'd like to derive the meaning from the Java type of the end node. It also allows us to add other types of constraint relationships in the future (like CANNOT_HAVE, etc). I'm just playing around with Neo4j for now, but so far I'm really liking what I've seen. On a separate, completely unrelated project, we were able to adapt fishhook style parent/child hierarchies from SQL to a graph and its opened up a whole new world of use cases for our dev team to explore. We're now able to navigate through company/organization hierarchies with way less code and much better response times. Keep up the good work! On Wed, Feb 23, 2011 at 7:42 PM, David Montag [via Neo4J User List] ml-node+2564675-719479922-366...@n3.nabble.com wrote: Cedric, Thank you for your feedback! We value it highly. I'm trying to understand your use case. You have entities of classes
Re: [Neo4j] Spring Neo4jTemplate
Ain't it beautiful? :) The internals of the implementation haven't been decided on yet, but it will be as efficient as possible. I'll keep you posted on the progress. David On Mon, Feb 28, 2011 at 1:33 PM, cedric.hurst ced...@spantree.net wrote: This is why I love open source! 52 will essentially give me the filter I'm currently doing manually, but in a much more expressive manner. I'm assuming the filter will be post-traversal? I'm not certain if there would be any performance advantage to doing it inside the traversal as an evaluator, but it was something I was pondering on my own. Just haven't had time to try it out. -- View this message in context: http://neo4j-user-list.438527.n3.nabble.com/Neo4j-Spring-Neo4jTemplate-tp2525460p2598691.html Sent from the Neo4J User List mailing list archive at Nabble.com. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Forums ?
groups as well. It's a shame that google groups doesn't offer forums yet (or maybe they do and I just don't know about it). On Sun, Feb 27, 2011 at 7:55 AM, Andreas Kollegger andreas.kolleg...@neotechnology.com wrote: That's true. It is possible to search the mailing list. Do you prefer those interfaces to using google groups? -Andreas On Feb 27, 2011, at 1:52 PM, Anders Nawroth wrote: Hi! As stated here: http://neo4j.org/community/list/ you can search the mailing list archives here: http://www.mail-archive.com/user@lists.neo4j.org/info.html or here: http://www.listware.net/list-neo4j-user.html /anders 2011-02-27 08:30, Emilio Dabdoub wrote: I'm almost sure that somebody asked this question before, but I did not find a way to search the mailing list :) Why does neo4j not have a discussion forum? I'm sure collaboration would boost ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Message: 4 Date: Sun, 27 Feb 2011 09:40:26 -0600 From: Cedric Hurst ced...@spantree.net Subject: Re: [Neo4j] Forums ? To: Neo4j user discussions user@lists.neo4j.org Message-ID: AANLkTi=ynoscnfzzbmb1jn1phoox420zerdoc8gk+...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 Alex, Sorry, I meant to say it's a shame that github doesn't yet offer forums. The hazards of posting before caffeine. On Sun, Feb 27, 2011 at 9:27 AM, Andreas Kollegger andreas.kolleg...@neotechnology.com wrote: A company that encourages people to go-ogle does not seem entirely innocent. :P
On Feb 27, 2011, at 4:16 PM, Alex Averbuch wrote: besides the lack of edit (which isn't so important) what does a forum have that google groups doesn't? ps, google does no evil On Sun, Feb 27, 2011 at 4:13 PM, Cedric Hurst ced...@spantree.net wrote: +1 for google groups as well. It's a shame that google groups doesn't offer forums yet (or maybe they do and I just don't know about it). On Sun, Feb 27, 2011 at 7:55 AM, Andreas Kollegger andreas.kolleg...@neotechnology.com wrote: That's true. It is possible to search the mailing list. Do you prefer those interfaces to using google groups? -Andreas On Feb 27, 2011, at 1:52 PM, Anders Nawroth wrote: Hi! As stated here: http://neo4j.org/community/list/ you can search the mailing list archives here: http://www.mail-archive.com/user@lists.neo4j.org/info.html or here: http://www.listware.net/list-neo4j-user.html /anders 2011-02-27 08:30, Emilio Dabdoub wrote: I'm almost sure that somebody asked this question before, but I did not find a way to search the mailing list :) Why does neo4j not have a discussion forum? I'm sure collaboration would boost ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Spring Neo4jTemplate
Cedric, Thank you for your feedback! We value it highly. I'm trying to understand your use case. You have entities of classes A, B, and C. Your graph looks like this: A --RELATED_TO-- B A --RELATED_TO-- C And you want to provide query methods for getting all B's and C's for a given A. Correct? If yes, could you please elaborate a bit more on the actual use case? I.e. what are the A's and B's and C's. It would be interesting to know what kind of a use case would drive this graph and these queries. Thanks, David On Wed, Feb 23, 2011 at 2:10 PM, cedric.hurst ced...@spantree.net wrote: Firstly, thanks for your great work on the Spring Data Graph API so far. I took a look at the draft you had out here: https://gist.github.com/835408 And it looks like an awesome start. I'm relatively new to Neo4J (saw the SpringOne keynote and the roo talk), but one thing I think would be very useful is if there was some sort of higher-level PathMapper that worked at the GraphBacked level, instead of working at the Neo4j node level. In my immediate case, I have three NodeBacked types in my graph: TypeA, TypeB, and TypeC. Types B and C can be related to Type A through the same relationship type, but I want to define two traversals that include only each type respectively. I'm using the FieldTraversalDescriptionBuilder but it's falling down when I try to do something like the following: class TypeA { @GraphTraversal(traversalBuilder = RelatedTraversalBuilder.class, elementClass = TypeB.class) private Iterable<TypeB> typeBs; @GraphTraversal(traversalBuilder = RelatedTraversalBuilder.class, elementClass = TypeC.class) private Iterable<TypeC> typeCs; private static class RelatedTraversalBuilder implements FieldTraversalDescriptionBuilder { public TraversalDescription build(NodeBacked start, Field field) { return new TraversalDescriptionImpl().relationships(RelTypes.RELATED_TO); } } } In this particular case, the traversal will return a mix of TypeB and TypeC and throw a class cast exception. 
I'd love to have some abstraction of a Path such that path.startNode() and path.endNode() would return the actual NodeBacked classes themselves, so I could write my PathMappers to use something like:

    class MyPathMapper implements PathMapper<Void> {
        public Void mapPath(GraphPath path) {
            if (path.endNode() instanceof TypeB) {
                return Evaluator.INCLUDE_AND_PRUNE;
            } else {
                return Evaluator.EXCLUDE_AND_PRUNE;
            }
        }
    }

It seems like the template should be able to provide this with something along the lines of:

    interface PathMapper<T> {
        ...
        @Override
        public Void mapPath(GraphPath graphPath) {
            eachPath(graphPath);
            return null;
        }
    }

    interface GraphPath {
        NodeBacked startNode();
        NodeBacked endNode();
        ...
    }

The key here is, of course, that GraphPath provides NodeBacked classes instead of primitive neo4j nodes. Beyond the simple instanceof example I gave, I think it would also be useful for evaluating paths based on NodeEntity properties rather than having to call node.getProperty('somePropertyNameThatIWillProbablyMisspell'). Not sure if my request makes sense, or is reasonable to implement, but I think it would certainly make my life a lot easier.

-- View this message in context: http://neo4j-user-list.438527.n3.nabble.com/Neo4j-Spring-Neo4jTemplate-tp2525460p2563548.html Sent from the Neo4J User List mailing list archive at Nabble.com.
Re: [Neo4j] How to query based on properties
Agam, Depending on the set of possible values, you could represent the properties with relationships instead. A unique property value can then be represented by a node, which would be linked to all nodes that have that value. The relationship type could indicate the property. The value nodes would then be indexed so that you can find the right node when setting the property (i.e. creating a relationship to the value node). Also, it would be great if you could elaborate a bit more on the actual use case behind this algorithm. That way, a more suitable solution might emerge, solving your problem in a different way. Thanks, David On Wed, Feb 23, 2011 at 10:36 PM, Agam Dua agam...@gmail.com wrote: Hey I'm a graph database and Neo4j newbie and I'm in a bit of a fix: *Problem Description* Let's say I have 'n' nodes in the graph, representing the same type of object. They have certain undirected links between them. Now each of these 'n' nodes has the same 10 properties, the *values* of which may differ. *Problem Statement* Take starting node A. I need to find a way to traverse all the nodes of the graph and print out which nodes have the most properties in common with A. For example, if A, C, D, E, F, G have 'x' properties in common I want to print the nodes. Then, I want to print the nodes which have 'x-1' properties with the same value. Then 'x-2', and so on. *Question* Now my question is, is this possible? If so, what would be the best way to go about it? Thanks in advance! Agam. * * ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
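For reference, the ranking Agam describes (most shared property values first, then one fewer, and so on) can be sketched without any graph API at all. The following is only a minimal illustration in plain Java, assuming each node's properties have already been read into a map; the class name PropertyOverlap, the three sample properties and the node names are made up for the example and are not part of Neo4j.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class PropertyOverlap {

    /** Counts the property keys whose values are equal in both maps. */
    static int sharedCount(Map<String, Object> a, Map<String, Object> b) {
        int shared = 0;
        for (Map.Entry<String, Object> e : a.entrySet()) {
            if (b.containsKey(e.getKey()) && Objects.equals(e.getValue(), b.get(e.getKey()))) {
                shared++;
            }
        }
        return shared;
    }

    /** Groups every other node by how many values it shares with the start node, highest count first. */
    static TreeMap<Integer, List<String>> rank(String start, Map<String, Map<String, Object>> nodes) {
        TreeMap<Integer, List<String>> byCount = new TreeMap<>(Comparator.<Integer>reverseOrder());
        Map<String, Object> startProps = nodes.get(start);
        for (Map.Entry<String, Map<String, Object>> e : nodes.entrySet()) {
            if (e.getKey().equals(start)) {
                continue; // do not compare the start node with itself
            }
            int n = sharedCount(startProps, e.getValue());
            byCount.computeIfAbsent(n, k -> new ArrayList<>()).add(e.getKey());
        }
        return byCount;
    }

    // Hypothetical sample data: three properties instead of ten, to keep the example short.
    private static Map<String, Object> props(String color, int size, String shape) {
        Map<String, Object> p = new HashMap<>();
        p.put("color", color);
        p.put("size", size);
        p.put("shape", shape);
        return p;
    }

    static Map<String, Map<String, Object>> sample() {
        Map<String, Map<String, Object>> nodes = new LinkedHashMap<>();
        nodes.put("A", props("red", 1, "round"));
        nodes.put("B", props("red", 1, "square"));
        nodes.put("C", props("red", 2, "square"));
        nodes.put("D", props("red", 1, "round"));
        return nodes;
    }

    public static void main(String[] args) {
        rank("A", sample()).forEach((count, names) ->
                System.out.println(count + " in common: " + names));
    }
}
```

With the value-node model David suggests, the same count would presumably become the number of value nodes two nodes are both linked to, so the comparison could be done by traversal instead of by reading every property.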
Re: [Neo4j] Exception creating EmbeddedGraphDatabase from a Servlet
Pablo, Did you resolve this, or is this still an issue for you? David On Mon, Feb 21, 2011 at 2:08 AM, Pablo Pareja ppar...@era7.com wrote: Hi Jim, I already did it, I'm using a constant defined in the code for the DB folder which it's the same I used for testing things with a really simple jar. I also tried changing the permissions for every DB file granting every kind of permission to any kind of user (I know that's kind of crazy but just wanted to make sure it didn't have anything to do with that...) Pablo On Mon, Feb 21, 2011 at 11:00 AM, Jim Webber j...@neotechnology.com wrote: Hi Pablo, This caught my eye in your stacktrace: Unable to create directory path[] for Neo4j Can you confirm that you have provided the right path for your database into your Jetty app? Jim ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Pablo Pareja Tobes LinkedInhttp://www.linkedin.com/in/pabloparejatobes Twitter http://www.twitter.com/pablopareja http://about.me/pablopareja http://www.ohnosequences.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] IllegalStateException after successful commit
John,

Are you doing tx.success() and tx.finish() when you commit? If so, then you likely already have an open transaction for that thread. If you start a new transaction while already in an open transaction, you will get a dummy transaction. It will not do anything when you commit it successfully, but if it fails, the parent transaction will fail. The data in the nested transaction only gets committed when the parent transaction commits. So if you've lost the reference to the top-level transaction, that's an issue. Also, if you'd like to share the code, I'd be glad to have a look at it.

David

On Wed, Feb 16, 2011 at 10:00 PM, John Howard johnyho...@gmail.com wrote:

We found this strange behaviour with regard to transactions (we use neo1.3-SNAPSHOT). Here are the steps in our application:

1. created, indexed and committed 10 nodes successfully
2. created, indexed and committed 40 nodes successfully
3. created, indexed and committed 900 nodes successfully
4. we found some app specific mistake in step 3. So we removed indexes, deleted 900 nodes and committed successfully.
5. we tried to query those deleted nodes just to confirm whether the delete was successful.
6. we were able to search for the 900 (deleted) nodes from the index, and when we tried to access a property of a (deleted) node, it threw the following exception:

    java.lang.IllegalStateException: Node[7599] has been deleted in this tx
        at org.neo4j.kernel.impl.core.LockReleaser.getCowPropertyRemoveMap(LockReleaser.java:445)
        at org.neo4j.kernel.impl.core.NodeManager.getCowPropertyRemoveMap(NodeManager.java:898)
        at org.neo4j.kernel.impl.core.Primitive.getPropertyKeys(Primitive.java:99)
        at org.neo4j.kernel.impl.core.NodeProxy.getPropertyKeys(NodeProxy.java:129)
        at neopoc.data.util.graphManager.getNodesProperties(graphManager.java:771)

7. As a result of the above exception, it rolled back steps 1 & 2 as well. So we lost all the nodes.
My suspicion is that even though we committed successfully in steps 1, 2, 3 and 4, they were never internally committed. Maybe it is some kind of deferred/lazy commit and not an immediate commit. Thank you for your assistance.
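The behaviour David describes in this thread (an inner transaction that commits nothing by itself, but can fail its parent) can be illustrated with a toy model. This is emphatically not Neo4j's implementation: NestedTxModel, create() and committedCount() are invented for the sketch, and only the success()/finish() call pattern mirrors the calls mentioned in the reply.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedTxModel {

    private final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();
    private int depth = 0;          // how many transactions are currently open
    private boolean failed = false; // set when any transaction finishes without success()

    final class Tx {
        private boolean success = false;
        private boolean finished = false;

        void success() {
            success = true;
        }

        void finish() {
            if (finished) {
                return;
            }
            finished = true;
            if (!success) {
                failed = true; // a failed inner transaction dooms the outermost one
            }
            depth--;
            if (depth == 0) {  // only the outermost finish() really commits or rolls back
                if (!failed) {
                    committed.addAll(pending);
                }
                pending.clear();
                failed = false;
            }
        }
    }

    Tx beginTx() {
        depth++;
        return new Tx();
    }

    void create(String node) {
        if (depth == 0) {
            throw new IllegalStateException("not in a transaction");
        }
        pending.add(node);
    }

    int committedCount() {
        return committed.size();
    }

    /** An outer transaction is left open by mistake while inner ones "commit". */
    static int[] leakedOuterScenario() {
        NestedTxModel db = new NestedTxModel();
        Tx outer = db.beginTx();          // forgotten top-level transaction

        Tx inner = db.beginTx();
        db.create("n1");
        inner.success();
        inner.finish();                   // looks like a commit, but nothing is committed yet
        int afterInnerCommit = db.committedCount();

        Tx failing = db.beginTx();
        failing.finish();                 // finished without success(): marks everything as failed

        outer.success();
        outer.finish();                   // outermost finish rolls the whole lot back
        return new int[] { afterInnerCommit, db.committedCount() };
    }

    /** The ordinary case: one transaction, begun and finished properly. */
    static int cleanScenario() {
        NestedTxModel db = new NestedTxModel();
        Tx tx = db.beginTx();
        db.create("n1");
        tx.success();
        tx.finish();
        return db.committedCount();
    }

    public static void main(String[] args) {
        int[] leaked = leakedOuterScenario();
        System.out.println("leaked outer tx: " + leaked[0] + " then " + leaked[1] + " committed");
        System.out.println("clean tx: " + cleanScenario() + " committed");
    }
}
```

If the model is a fair picture, it would explain how work that appeared to be committed in earlier steps can still be lost when a later operation on the same thread fails.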
Re: [Neo4j] Does the index commit with the transaction?
Hi Massimo, I just want to understand your use case. You have a stream of records (log rows in your case) coming in. You process each record, somehow mutating the graph. Then you want to remember that you've already processed that record. If the same record arrives at some later point, you want to know that it has already been processed. If this is an accurate description, I'd like to know what kind of processing and mutating of the graph it is that you do. Maybe you could describe it? Thanks, David On Tue, Feb 15, 2011 at 1:56 PM, Massimo Lusetti mluse...@gmail.com wrote: On Tue, Feb 15, 2011 at 9:14 PM, Mattias Persson matt...@neotechnology.com wrote: 100 million sounds strange :) but to have a hand full of key/value pairs pointing to the same entity is rather normal. Could you elaborate more on that use case to let us know why you apparently have super many of those to the same entity? I need to elaborate a series of data coming in in a form of a log row inside a file. The data consist of action taken from users. The problem is very similar to having to parse squid log files which contains authenticated users. Each log file could arrive from more then one location (squid peering simile) and I've to be sure to not elaborate the data more then one time. So I though I could calculate an hash of each row and index it while storing the row's Node in the db. After a couple of tests it has been revealed that storing each row in the db is not feasible since it would occupy more then 700G of disk space since each node would consist of at least 800 bytes (node + properties) and I've 222 million nodes growing. Since what I really want to know is the existence of an hash within the index, which means I've already elaborated the unit of work (row), I though I could simply index each hash to the same node. I know it sounds strange cause it's the same to me but I cannot afford to store that amount of data, even more cause it doesn't contain the elaborated data yet. 
Any hints are really appreciated.

-- Massimo http://meridio.blogspot.com
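The "have I already processed this row?" check Massimo describes can be sketched independently of where the hashes end up being stored. The snippet below is only an illustration in plain Java: the class name RowDeduplicator is made up, SHA-256 is an arbitrary choice of hash, and the in-memory HashSet merely stands in for whatever persistent index (Lucene via Neo4j, or something else) would hold the hashes in practice.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

public class RowDeduplicator {

    // Stand-in for a persistent index of row hashes (assumption: real use needs durable storage).
    private final Set<String> seen = new HashSet<>();

    /** Hex-encoded SHA-256 of one log row. */
    static String hash(String row) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(row.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    /** Returns true the first time a row is offered and false for every repeat. */
    boolean markProcessed(String row) {
        return seen.add(hash(row));
    }

    public static void main(String[] args) {
        RowDeduplicator dedup = new RowDeduplicator();
        String row = "2011-02-15 21:14:03 user42 GET http://example.org/"; // made-up log row
        System.out.println(dedup.markProcessed(row)); // first arrival
        System.out.println(dedup.markProcessed(row)); // same row from a second location
    }
}
```

Only the hash is kept, not the row, which is the point of the approach: the check needs to know that a row was seen, not what it contained.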
Re: [Neo4j] POJO best practice?
Hi Massimo, You can also check out our integration with Spring Data: http://www.springsource.org/spring-data If you scroll down you can find links to the docs and the github repo. This approach uses a simple POJO model with annotations, like JPA. It minimizes boilerplate code, and gives you a lot of stuff for free. There is also rudimentary cross-store support, for building applications that span across both a JPA datasource and Neo4j. Please let us know if you have any questions. David On Mon, Feb 14, 2011 at 6:56 AM, Massimo Lusetti mluse...@gmail.com wrote: Hi all, In almost all applications/examples/doc/wiki I've seen on neo4j.org the domain is based on POJO and this is somewhat usual but here I see you suggesting doing interface for POJO. Having an Actor interface implemented by an ActorImpl classes which is a POJO plus a reference to the underlying node. First why having an interface declaring a POJO just for getters/setters method, isn't this boilerplate code? Second since you do that, why not exposing the underlying node as a getter so you can just use Actor and never had to cast it to ActorImpl? Am I missing something? Cheers -- Massimo http://meridio.blogspot.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] getting non-deterministic results with getAllNodes()
Hi Raghava, Could you please provide the code used to create the store? Could you also please provide the code you use to iterate all nodes? The reference node always has id 0, so you can filter by ID to ensure that you don't process that node. David On Thu, Feb 3, 2011 at 10:41 AM, Raghava Mutharaju m.vijayaragh...@gmail.com wrote: Hi all, I want to iterate over all the nodes in the graph and then do a tranversal on each of them. To do this, I used the getAllNodes() method of GraphDatabaseService class. But the number of nodes I get always varies on each run. I checked the number of nodes I created during graph creation time and they were as expected. The results of getAllNodes() return each node many number of times. Nodes created: 413, Nodes obtained by getAllNodes(): 6506, 5426, 7021, 7434, .. What might be going wrong here? Another question I had is, it looks like a start/root node would be created by default. Is this the case? If so, how can I avoid/recognize it in the getAllNodes() results? Thank you. Regards, Raghava. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] getting non-deterministic results with getAllNodes()
Andreas' rain man style subtraction skills to the rescue :) Glad it worked out. Happy hacking! David On Thu, Feb 3, 2011 at 12:16 PM, Raghava Mutharaju m.vijayaragh...@gmail.com wrote: Hi David Andreas, Aah, I deleted the contents of the db and ran it, this time I got the expected result of 413 nodes. I was running the same program which creates traverses the graph multiple times, so the inconsistent results. Silly of me :) Thank you for pointing it out. Regards, Raghava. On Thu, Feb 3, 2011 at 1:54 PM, Andreas Kollegger andreas.kolleg...@neotechnology.com wrote: Hi Raghava, Also, are you sure you're only creating the nodes once? Looking at your numbers, (7434-7021=413) happens to be true, though the other intervals don't match. Is this from a single run starting with a clean database (the db directory is empty)? -Andreas On Feb 3, 2011, at 7:51 PM, David Montag wrote: Hi Raghava, Could you please provide the code used to create the store? Could you also please provide the code you use to iterate all nodes? The reference node always has id 0, so you can filter by ID to ensure that you don't process that node. David On Thu, Feb 3, 2011 at 10:41 AM, Raghava Mutharaju m.vijayaragh...@gmail.com wrote: Hi all, I want to iterate over all the nodes in the graph and then do a tranversal on each of them. To do this, I used the getAllNodes() method of GraphDatabaseService class. But the number of nodes I get always varies on each run. I checked the number of nodes I created during graph creation time and they were as expected. The results of getAllNodes() return each node many number of times. Nodes created: 413, Nodes obtained by getAllNodes(): 6506, 5426, 7021, 7434, .. What might be going wrong here? Another question I had is, it looks like a start/root node would be created by default. Is this the case? If so, how can I avoid/recognize it in the getAllNodes() results? Thank you. Regards, Raghava. 
Re: [Neo4j] strict class persistence vs. projections
Hi, I think that's a good idea. It makes good use of the schema-free nature of the graph database, and also decouples the data from the interpretation of the data. I'm not sure it has to be two different modes. Maybe you just use one projection of the data if that's all you need. Or you use multiple. Does this still imply one node per class (projection)? Are you aiming to change that too? David On Wed, Jan 19, 2011 at 3:40 AM, Michael Hunger michael.hun...@neotechnology.com wrote: Today Andreas Kollegger and I had an interesting discussion about the prevalence of class based mapping of entities to a graph store. One of the strengths of a graph store is that you don't need a strict schema for your data and you can use lots of different projections to work with it. Spring Data Graph currently focuses on a single projection of a node to an entity instance (1:1) that traditional ORMs focused on. But we could do more. We can project the node to many different classes, as long as the properties that are part of the class are there, we can sensibly work with the node. Even if the projection class is not part of the type hierarchy that was originally used to create and populate the node it can be used to access it. That makes room for some interesting things like: * new domain concepts can be used on top of existing data * get rid of inheritance hierarchies * traverse over a lot of nodes that support some basic properties that form a concept (e.g. Person) using that simple concept during the traversal and from there project those nodes to more concrete concepts as needed (e.g. Employee, Customer) * data/schema evolution / versioning We can run DATAGRAPH in a strict mode (not default) where it checks that the node requested always fit to the domain class specified (according to the type hierarchy stored in the graph). But we can (and should promote) running it in a more loosely coupled way where this free projection is possible. 
I would like to introduce a <T> T NodeBacked.projectTo(Class<T>) method to the aspect so that this projection is easily available. Looking for feedback on that.

Cheers
Michael
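To make the projection idea from this thread concrete, here is a small sketch of what "the same node seen through several classes" could look like. It is not Spring Data Graph code: the node is modelled as a plain property map, the projection is done with a JDK dynamic proxy, and ProjectionSketch, Person, Employee and the sample values are all invented for illustration. Only the method name projectTo is borrowed from Michael's proposal.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class ProjectionSketch {

    // Two unrelated "domain concepts" that can both be laid over the same data.
    public interface Person {
        String getName();
    }

    public interface Employee {
        String getName();
        String getCompany();
    }

    /** Projects a property map onto any interface whose getters match stored property names. */
    static <T> T projectTo(Map<String, Object> properties, Class<T> type) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            boolean isGetter = name.startsWith("get") && name.length() > 3
                    && (args == null || args.length == 0);
            if (!isGetter) {
                throw new UnsupportedOperationException(name);
            }
            // getCompany -> company
            String key = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            if (!properties.containsKey(key)) {
                // the "loose" mode: a projection works as long as the needed properties exist
                throw new IllegalStateException("no property '" + key + "' on this node");
            }
            return properties.get(key);
        };
        Object instance = Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, handler);
        return type.cast(instance);
    }

    static Map<String, Object> sampleNode() {
        Map<String, Object> node = new HashMap<>();
        node.put("name", "Ada");
        node.put("company", "Acme");
        return node;
    }

    public static void main(String[] args) {
        Map<String, Object> node = sampleNode();
        Person person = projectTo(node, Person.class);       // basic concept
        Employee employee = projectTo(node, Employee.class); // more concrete concept, same data
        System.out.println(person.getName() + " works at " + employee.getCompany());
    }
}
```

Neither interface needs to be part of a shared type hierarchy, which is the decoupling the thread is after; a strict mode would add a check against stored type information before handing out the projection.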
Re: [Neo4j] SVN repo access problem
Yeah, so DNS works fine. It's just accessing the server that's not working for me: --- svn.neo4j.org ping statistics --- 8 packets transmitted, 0 packets received, 100.0% packet loss David On Sat, Jan 15, 2011 at 1:15 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Same here from Sweden. Seems to work. Cheers, /peter neubauer GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Sat, Jan 15, 2011 at 9:46 PM, Ivan Brusic i...@brusic.com wrote: I am using using Google DNS (8.8.8.8) as my DNS server (from the east coast) and I have no issues. $ nslookup svn.neo4j.org Server: 8.8.8.8 Address: 8.8.8.8#53 Non-authoritative answer: Name: svn.neo4j.org Address: 194.218.25.20 On Sat, Jan 15, 2011 at 2:59 PM, David Montag david.mon...@neotechnology.com wrote: Hi, For some reason I can't access svn.neo4j.org from California (i.e. not even ping). It works just fine from my Swedish server though. Anyone else experiencing anything similar, or is it just some freak outage? -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] SVN repo access problem
Already did, and they seem to get lost right before the Neo4j network. But traceroutes can lie. Anyway, checked with other US people, they can access it fine, so it's probably just my local network/computer at home that's screwy somehow. David On Sat, Jan 15, 2011 at 3:15 PM, Michael Hunger michael.hun...@neotechnology.com wrote: try traceroute to see where the packets get lost Sent from my iBrick4 Am 15.01.2011 um 22:26 schrieb David Montag david.mon...@neotechnology.com: Yeah, so DNS works fine. It's just accessing the server that's not working for me: --- svn.neo4j.org ping statistics --- 8 packets transmitted, 0 packets received, 100.0% packet loss David On Sat, Jan 15, 2011 at 1:15 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Same here from Sweden. Seems to work. Cheers, /peter neubauer GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Sat, Jan 15, 2011 at 9:46 PM, Ivan Brusic i...@brusic.com wrote: I am using using Google DNS (8.8.8.8) as my DNS server (from the east coast) and I have no issues. $ nslookup svn.neo4j.org Server: 8.8.8.8 Address: 8.8.8.8#53 Non-authoritative answer: Name: svn.neo4j.org Address: 194.218.25.20 On Sat, Jan 15, 2011 at 2:59 PM, David Montag david.mon...@neotechnology.com wrote: Hi, For some reason I can't access svn.neo4j.org from California (i.e. not even ping). It works just fine from my Swedish server though. Anyone else experiencing anything similar, or is it just some freak outage? 
Re: [Neo4j] Traverse API
Hi Ido, You make excellent points. The traversal API that you're referring to (Node.traverse) is however considered a legacy, stable API. What you are looking for is probably the newer, unstable API of the new traversal framework. You can read more about it here: http://wiki.neo4j.org/content/Traversal_Framework Using the traversal framework, you would typically use an Evaluator to decide whether or not to continue along a relationship. Calling path.lastRelationship() in the evaluator gives you the last relationship in the path so far. Please keep us posted on your progress, and don't hesitate to ask questions. It is after all an unstable API, and any feedback is appreciated. Let us know if you want more examples of usage. David On Wed, Jan 5, 2011 at 2:13 PM, Ido Ran ido@gmail.com wrote: Hi, I have a question about the traversal API: The API use ReturnableEvaluator to decide if a node should be selected by the traverser which leave place for developer to enter any arbitrary rules of which nodes to select. On the other hand the implementation return from Node allow to specify only RelationshipType and Direction to decide on which link to traverse. My question is why not have TraversableLink interface which will have boolean method isTraversable (or something like that) that will get the TraversePosition and Link and will decide if the link should be traversed. This way things like the number of links already traversed or properties of the link can be take into account. What do you think? Ido ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] help with traversals
Hi,

Mattias, your solution works because Amos' algo was a special case where you can verify the path simply by looking one step back. It's a smart solution to his problem! However, what about traversals where you have more complex calculations going on and essentially want to carry a state along with you from hop to hop? That state would be different for each traversal branch. Is that something the traversal API should support? Community, WDYT? (Sorry for the thread hijack.)

David

On Thu, Dec 23, 2010 at 12:48 PM, Mattias Persson matt...@neotechnology.com wrote:

Hi, interesting traversal... so you're saying that paths like this could be returned:

    (start)-B-()-A[P=1]-()-A[P=1]-()-B-(end)

but not paths like this:

    (start)-A[P=1]-()-B-()-A[P=2]-(end)

Am I correct? There are common path algorithms in GraphAlgoFactory http://components.neo4j.org/neo4j-graph-algo/apidocs/org/neo4j/graphalgo/GraphAlgoFactory.html but they don't support an evaluator as an argument, an evaluator which could look like:

    {
        public Evaluation evaluate( Path position )
        {
            Relationship rel = position.lastRelationship();
            if ( rel == null || !rel.isType( MyRelTypes.A ) )
                return Evaluation.INCLUDE_AND_CONTINUE;
            Object p = rel.getProperty( "P" );
            Object previousP = lookBackwardsForP( position );
            if ( previousP != null && !p.equals( previousP ) )
                return Evaluation.EXCLUDE_AND_PRUNE;
            return Evaluation.INCLUDE_AND_CONTINUE;
        }

        private Object lookBackwardsForP( Path position )
        {
            int count = 0;
            for ( Relationship rel : position.relationships() )
            {
                // Skip the first one
                if ( count++ > 0 )
                {
                    if ( rel.isType( MyRelTypes.A ) )
                        return rel.getProperty( "P" );
                }
            }
            return null;
        }
    }

You could probably create a breadth first traverser with such an evaluator to get your paths. I don't know about performance since each evaluation needs to go back one or more steps in the path, but you could try it out.
2010/12/23 Amos Ben Israel lucis.domi...@gmail.com Hello, I'm trying to find all paths between two nodes when relations can be of one of two types (A or B) without repeating the same node twice - so far quite simple. the complication is that relations of type A have a property P and I want only paths where P has the same value all the way (but can have two different values in two different paths). I tried to have a variable that is reset every time the traverse starts again from the beginning but find it difficult to know when did this happen. position.length() helps when I only have relations of type A - it does not ignore relations of type B which can be on the path. iterating over the path at each position works but seems inefficient - I'd like to hear ideas for a more efficient solution. Amos. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
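David's question about carrying per-branch state from hop to hop can be pictured with a depth-first search where the state travels as a method argument, so every branch sees its own copy. The sketch below is plain Java over a toy directed graph, not the Neo4j traversal framework; SameValuePaths, Edge and the sample graph are made up for illustration. It finds all paths without repeated nodes in which every type-A edge has the same P, which is Amos' original problem.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SameValuePaths {

    static final class Edge {
        final String to;
        final char type;   // 'A' or 'B'
        final Integer p;   // property P, only set on type 'A' edges

        Edge(String to, char type, Integer p) {
            this.to = to;
            this.type = type;
            this.p = p;
        }
    }

    private final Map<String, List<Edge>> outgoing = new HashMap<>();

    void connect(String from, String to, char type, Integer p) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, type, p));
    }

    /** All node sequences from start to end where every type-A edge carries the same P. */
    List<List<String>> findPaths(String start, String end) {
        List<List<String>> result = new ArrayList<>();
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        walk(start, end, null, visited, result);
        return result;
    }

    // branchP is the per-branch state: the P value seen so far on this branch, or null.
    private void walk(String node, String end, Integer branchP,
                      Set<String> visited, List<List<String>> result) {
        if (node.equals(end)) {
            result.add(new ArrayList<>(visited));
            return;
        }
        for (Edge e : outgoing.getOrDefault(node, new ArrayList<>())) {
            if (visited.contains(e.to)) {
                continue; // never visit the same node twice on one path
            }
            Integer nextP = branchP;
            if (e.type == 'A') {
                if (branchP != null && !branchP.equals(e.p)) {
                    continue; // P differs from the value fixed earlier on this branch: prune
                }
                nextP = e.p;
            }
            visited.add(e.to);
            walk(e.to, end, nextP, visited, result);
            visited.remove(e.to); // backtrack; the caller's branchP is untouched
        }
    }

    static SameValuePaths sample() {
        SameValuePaths g = new SameValuePaths();
        g.connect("s", "x", 'A', 1);
        g.connect("x", "y", 'B', null);
        g.connect("y", "t", 'A', 1);
        g.connect("y", "t", 'A', 2); // would mix P=1 and P=2 on the s-x-y-t route
        g.connect("s", "z", 'A', 2);
        g.connect("z", "t", 'A', 2);
        return g;
    }

    public static void main(String[] args) {
        for (List<String> path : sample().findPaths("s", "t")) {
            System.out.println(path);
        }
    }
}
```

Because the state is passed down rather than recomputed, no step has to look backwards along the path, which addresses the inefficiency Amos mentions. Whether and how the traversal framework should offer something equivalent is the open question in the thread.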
Re: [Neo4j] InvalidRecordException exception
Hi George, If you would like to share your code, or really any code that reproduces this, along with the store directory, it would be easier for us to help you. David On Fri, Dec 17, 2010 at 5:00 AM, George Ciubotaru george.ciubot...@weedle.com wrote: Hello Johan, Unfortunately the proposed solution to lock the delete operation din't solve the issue. I'm still getting the InvalidRecordException exception but not as often. I'm getting instead a deadlock exception: Caused by: org.neo4j.kernel.DeadlockDetectedException: Transaction(113733)[STATUS_ACTIVE,Resources=0] can't wait on resource RWLock[Relationship[2221154]] since = Transaction(113733)[STATUS_ACTIVE,Resources=0] - RWLock[Relationship[2221154]] - Transaction(113733)[STATUS_ACTIVE,Resources=0] - RWLock[Relationship[2221154]] at org.neo4j.kernel.impl.transaction.RagManager.checkWaitOnRecursive(RagManager.java:219) at org.neo4j.kernel.impl.transaction.RagManager.checkWaitOnRecursive(RagManager.java:247) at org.neo4j.kernel.impl.transaction.RagManager.checkWaitOn(RagManager.java:186) at org.neo4j.kernel.impl.transaction.RWLock.acquireWriteLock(RWLock.java:300) at org.neo4j.kernel.impl.transaction.LockManager.getWriteLock(LockManager.java:129) at org.neo4j.kernel.impl.core.NodeManager.acquireLock(NodeManager.java:684) at org.neo4j.kernel.impl.core.RelationshipImpl.delete(RelationshipImpl.java:143) at org.neo4j.kernel.impl.core.RelationshipProxy.delete(RelationshipProxy.java:50) at Graphing.Graph.deleteRelationships(Graph.java:1257) I'm not using any lock mechanism other than the one you've proposed and that only for that delete operation. Any idea of might be wrong here? Thank you, George -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: 15 December 2010 13:32 To: Neo4j user discussions Subject: Re: [Neo4j] InvalidRecordException exception This will still happen in the 1.2.M05 release. 
I just wanted to make sure I linked the stacktrace's line numbers to the right part of the code since that exception being thrown at a different place in the delete method could mean there are other problems. -Johan On Wed, Dec 15, 2010 at 1:42 PM, George Ciubotaru george.ciubot...@weedle.com wrote: Yes, the version I'm currently using is 1.1. Shall I understand that this kind of issue shouldn't occur in 1.2.M05? For the moment I'll take the pessimistic approach by guarding against (as in the example you gave) to assure that this is the reason and then I'll just accept the exception. Thank you for your quick and detailed response. Best regards, George -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Johan Svensson Sent: 15 December 2010 12:23 To: Neo4j user discussions Subject: Re: [Neo4j] InvalidRecordException exception Sorry, should also have asked what Neo4j version you use but guessing it is 1.1 or early milestone release? If so I think the problem is caused by two or more concurrent transactions running delete on the same relationship. If two transactions get a reference to the same relationship and concurrently delete that relationship it is possible for a InvalidRecordException to be thrown instead of a NotFoundException since the write lock is grabbed after the relationship has been verified to exist. Solution is either to accept the exception or to guard against it by first acquiring a read lock on the relationship before invoking relationship.delete(). 
Code example how to do this:

    GraphDatabaseService gdb; // the graph db
    Relationship relationship; // the relationship to delete
    LockManager lockManager = ((EmbeddedGraphDatabase) gdb).getConfig().getLockManager();
    lockManager.getReadLock( relationship );
    try {
        relationship.delete();
    } finally {
        lockManager.releaseReadLock( relationship );
    }

Regards,
Johan
Re: [Neo4j] [SPAM] R-Tree indexing performance
Also, what kind of performance tuning of Neo4j did you do? Would you like to post your Neo4j config properties here? And maybe also an ls -lh (or Windows equivalent) of your database directory? And as Rick said, warming up the caches is crucial.

David

On Fri, Dec 17, 2010 at 8:54 AM, rick.bullo...@burningskysoftware.com wrote:

Have you tried repeating the tests after the database has been warmed up (loaded into cache)?

Original Message
Subject: [SPAM] [Neo4j] R-Tree indexing performance
From: Dave Hesketh dave.hesk...@compassengine.com
Date: Fri, December 17, 2010 9:43 am
To: u...@lists.neo4j.org

I'm currently comparing the performance of R-Tree indexing in Neo4j with PostGIS/PostgreSQL. The database and index have been created and searched in Neo4j using Davide Savazzi's routines: ShapefileImporter and SearchWithin. The test dataset is 28,000 points (clustered around San Francisco and Vancouver) and the search is for the points within 1000 randomly generated 'circles' (i.e. 16-sided polygons). On average, each search in Neo4j takes 4 times longer than in PostGIS. Now that I know the processing is working correctly, I want to progressively increase the number of points to 10,000,000. Can anybody give me advice/tips on improving the performance in Neo4j before I start scaling up the test? At this stage, I am only interested in the search performance.

Neo4j Version: 1.2.M05
Environment: Windows 7, i5 64bit processor, quad core 4GB

Thanks
Dave
Re: [Neo4j] InvalidRecordException exception
Hi George, Could you please provide some background info on how you created/populated your Neo4j graph? David On Tue, Dec 14, 2010 at 10:03 AM, George Ciubotaru george.ciubot...@weedle.com wrote: Hello guys, I'm getting org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: Record[558335] not in use exception from time to time when deleting a relationship. I don't seem to find much information about this kind of exception. Any idea what it means? Thank you, George ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Traversal framework suggested change
Fantastic! I have yet to try the implementation out, but I'm positive that it's an improvement. The only comment I have right now is the use of the word SKIP. IMO it is ambiguous with respect to stopping vs excluding. I prefer EXCLUDE. Will try it out soon. Thanks Mattias! David

On Thu, Nov 18, 2010 at 3:46 PM, Mattias Persson matt...@neotechnology.com wrote:

2010/11/18 Mattias Persson matt...@neotechnology.com

I just spent less than two hours making this change locally and everything works and it generally feels great. Now that I've tried it out myself, this way of controlling pruning/filtering feels more awesome. I'll post some examples soon so that you can give feedback :)

Well, examples are maybe unnecessary.

    class Evaluator
        Evaluation evaluate(Path path);

    enum Evaluation
        INCLUDE_AND_CONTINUE
        INCLUDE_AND_STOP
        SKIP_AND_CONTINUE
        SKIP_AND_STOP

    class TraversalDescription
        +evaluator(Evaluator)
        -prune(PruneEvaluator)
        -filter(Predicate<Path>)

Also I've added lots of useful evaluators in an Evaluators class, but maybe those should reside in the Traversal class instead; however, I think the Traversal class is a little bloated as it is now.

There's the decision whether this thing could go into 1.2 or not... For one thing it breaks the API, but then again the PruneEvaluator/Predicate<Path> (filter) can still be there, mimicking Evaluators in the background. Because a PruneEvaluator can be seen as an Evaluator which instead of true/false returns INCLUDE_AND_CONTINUE/INCLUDE_AND_STOP, and a filter can be seen as an Evaluator which instead of true/false returns INCLUDE_AND_CONTINUE/SKIP_AND_CONTINUE. And you can have multiple evaluators just as you can with pruners/filters. This API seems more flexible and this will, in most cases, yield better traversal performance as well.
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
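The mapping from the old pruner/filter pair to a single evaluator, as described in the message above, could be sketched roughly as follows. This is only an illustration under stated assumptions: `Evaluation` and `Evaluator` reuse the names from the message but are defined locally (they are not the Neo4j classes), `EvaluatorAdapters` is a hypothetical helper, the pruner is read as "true means stop below this position", and the rule for combining several evaluators (include only if all include, continue only if all continue) is an assumption, not something the thread specifies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Local stand-ins named after the message; NOT the Neo4j API.
enum Evaluation {
    INCLUDE_AND_CONTINUE(true, true),
    INCLUDE_AND_STOP(true, false),
    SKIP_AND_CONTINUE(false, true),
    SKIP_AND_STOP(false, false);

    final boolean includes;
    final boolean continues;

    Evaluation(boolean includes, boolean continues) {
        this.includes = includes;
        this.continues = continues;
    }

    // Pick the constant that matches the two independent decisions.
    static Evaluation of(boolean includes, boolean continues) {
        if (includes) {
            return continues ? INCLUDE_AND_CONTINUE : INCLUDE_AND_STOP;
        }
        return continues ? SKIP_AND_CONTINUE : SKIP_AND_STOP;
    }
}

interface Evaluator<P> {
    Evaluation evaluate(P path);
}

// Hypothetical helper class, for illustration only.
final class EvaluatorAdapters {
    private EvaluatorAdapters() {}

    // Old-style pruner: true stops the branch, the position itself is still returned.
    static <P> Evaluator<P> fromPruner(Predicate<P> pruneAfter) {
        return path -> pruneAfter.test(path)
                ? Evaluation.INCLUDE_AND_STOP
                : Evaluation.INCLUDE_AND_CONTINUE;
    }

    // Old-style filter: true returns the position, the traversal always continues.
    static <P> Evaluator<P> fromFilter(Predicate<P> filter) {
        return path -> filter.test(path)
                ? Evaluation.INCLUDE_AND_CONTINUE
                : Evaluation.SKIP_AND_CONTINUE;
    }

    // Assumed combination rule: include only if all include, continue only if all continue.
    static <P> Evaluator<P> all(List<Evaluator<P>> evaluators) {
        return path -> {
            boolean includes = true;
            boolean continues = true;
            for (Evaluator<P> evaluator : evaluators) {
                Evaluation result = evaluator.evaluate(path);
                includes &= result.includes;
                continues &= result.continues;
            }
            return Evaluation.of(includes, continues);
        };
    }

    public static void main(String[] args) {
        // A "path" is just a list of node names in this sketch.
        Evaluator<List<String>> pruner = fromPruner(p -> p.size() >= 3);
        Evaluator<List<String>> filter = fromFilter(p -> p.size() >= 2);
        Evaluator<List<String>> both = all(Arrays.asList(pruner, filter));
        System.out.println(both.evaluate(Arrays.asList("a", "b", "c")));
    }
}
```

With this reading, existing pruners and filters keep working unchanged behind the new interface, which is the backwards-compatibility point made in the message.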
[Neo4j] Traversal framework suggested change
Hi all, Hopefully most of you are familiar with the traversal framework that was introduced in 1.1. It's powerful and provides for reusable traversal descriptions. It has some flaws though, and I would like to discuss one of them here. The traversal framework has this concept of pruning, which basically is an evaluation for each position, deciding whether or not to continue the traversal down this branch. The caveat here is that when you evaluate a position, you can't opt to prune before it. If you want to exclude a node based on information from that node, filtering has to be done on top of the pruning, with the same algorithm - once to stop the traversal, and once to exclude the node. So there are actually two orthogonal concepts at work here: whether to stop or not, and whether to include or not. What I'm proposing is to merge these two into one evaluator. That evaluator would return one of four values: CONTINUE_AND_INCLUDE_NODE, STOP_AND_INCLUDE_NODE, STOP_AND_EXCLUDE_NODE, or CONTINUE_AND_EXCLUDE_NODE. This would replace both the filtering and the pruning. I'm just throwing this out there to see if anyone else has had the same idea. Like / dislike? -- David Montag Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
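To make the proposal concrete, here is a small sketch of a traversal driven by a single four-valued evaluator. The four constant names come from the message; everything else (`Decision`, `TinyTraversal`, the adjacency-map tree) is hypothetical and has nothing to do with the actual traversal framework. The input is assumed to be a tree, so there is no cycle detection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// The four values from the proposal; the enum itself is a local illustration.
enum Decision {
    CONTINUE_AND_INCLUDE_NODE,
    STOP_AND_INCLUDE_NODE,
    STOP_AND_EXCLUDE_NODE,
    CONTINUE_AND_EXCLUDE_NODE;

    boolean includes() {
        return this == CONTINUE_AND_INCLUDE_NODE || this == STOP_AND_INCLUDE_NODE;
    }

    boolean continues() {
        return this == CONTINUE_AND_INCLUDE_NODE || this == CONTINUE_AND_EXCLUDE_NODE;
    }
}

// Hypothetical depth-first walker over a tree given as parent -> children.
final class TinyTraversal {
    private TinyTraversal() {}

    static List<String> traverse(Map<String, List<String>> children, String start,
                                 Function<String, Decision> evaluator) {
        List<String> result = new ArrayList<>();
        visit(children, start, evaluator, result);
        return result;
    }

    private static void visit(Map<String, List<String>> children, String node,
                              Function<String, Decision> evaluator, List<String> result) {
        // One evaluation per position answers both questions at once.
        Decision decision = evaluator.apply(node);
        if (decision.includes()) {
            result.add(node);
        }
        if (decision.continues()) {
            for (String child : children.getOrDefault(node, Collections.<String>emptyList())) {
                visit(children, child, evaluator, result);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("root", Arrays.asList("a", "b"));
        tree.put("a", Arrays.asList("a1"));
        tree.put("b", Arrays.asList("b1"));
        // Exclude node "a" and stop there, based on information from "a" itself.
        List<String> hits = traverse(tree, "root", node -> node.equals("a")
                ? Decision.STOP_AND_EXCLUDE_NODE
                : Decision.CONTINUE_AND_INCLUDE_NODE);
        System.out.println(hits); // [root, b, b1]
    }
}
```

The case that needed a pruner plus a separate filter, excluding a node and stopping at it based on that node's own data, is a single return value here.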
[Neo4j] Lucene Analyzer
Hi, When using the Lucene index, a custom Analyzer implementation can be supplied in the config. It will only be used if it's a fulltext index. Here's my issue. I can't see where the analyzer is used when indexing a node or relationship. As far as I can tell, it's only used when querying the fulltext index. Shouldn't indexing also make use of the analyzer? David -- David Montag, Manager, Customer Support Neo Technology, www.neotechnology.com Cell: 650.556.4411 david.mon...@neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Relationship Check During Traversal
Hi Paddy, One idea is to prune the traversal by looking at whether the path so far already has a transfer relationship or not. You would then do some kind of filtering of the resulting paths, e.g. only accepting those with correct end nodes. I don't know if the computational complexity of this is acceptable or not though. And I don't know if this answer was relevant or not. I hope it was :) David On Sat, Sep 11, 2010 at 4:09 AM, Paddy paddyf...@gmail.com wrote: Hi just a quick question regarding the use of the PruneEvaluator I was wondering what would be the best way to modify the TraversalDescription in the Dijkstra algorithm in order to prune a traversal when a branch has reached a second transfer relationship. I want to avoid multiple transfers in a bus network. If the graph is arranged as: (stop:1) --bus (stop:2) --transfer (stop:3) --bus (stop:4) --transfer (stop:5) Is it possible to prune the traversal branch when the 2nd transfer relationship is reached after (stop:4) Could this be achieved using a PruneEvaluator? Or am I approaching this the wrong way? thanks Paddy ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
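As a rough illustration of the idea in the reply, the check "has this path already used its one allowed transfer?" could look like the sketch below. A path is modelled simply as the list of relationship type names along it; `TransferPruning`, the type name "transfer" and the limit of one transfer are assumptions for the example, not part of the Neo4j traversal or Dijkstra APIs. In a real traversal the same two checks would sit inside a pruner and a result filter.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; a path is represented by its relationship type names.
final class TransferPruning {
    static final String TRANSFER = "transfer"; // assumed relationship type name
    static final int MAX_TRANSFERS = 1;        // assumed limit: at most one transfer

    private TransferPruning() {}

    static int countTransfers(List<String> relationshipTypes) {
        int transfers = 0;
        for (String type : relationshipTypes) {
            if (TRANSFER.equals(type)) {
                transfers++;
            }
        }
        return transfers;
    }

    // Pruning decision: do not expand further once a second transfer was taken.
    static boolean pruneAfter(List<String> relationshipTypes) {
        return countTransfers(relationshipTypes) > MAX_TRANSFERS;
    }

    // Filtering decision: the position reached via the second transfer is not a valid result.
    static boolean accept(List<String> relationshipTypes) {
        return countTransfers(relationshipTypes) <= MAX_TRANSFERS;
    }

    public static void main(String[] args) {
        List<String> toStop4 = Arrays.asList("bus", "transfer", "bus");
        List<String> toStop5 = Arrays.asList("bus", "transfer", "bus", "transfer");
        System.out.println(pruneAfter(toStop4) + " " + accept(toStop4)); // false true
        System.out.println(pruneAfter(toStop5) + " " + accept(toStop5)); // true false
    }
}
```

Counting along the whole path on every evaluation costs time proportional to the path length; whether that is acceptable depends on the graph, which is the complexity caveat raised in the reply.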
Re: [Neo4j] help ..Activity Streaming
Hi Joseph,

The implementation of such a system would be highly dependent on your architecture. But my gut feeling is that activity feed data is kind of transient - it's not a big deal if it is lost. Its usage and access patterns are also different from the data it originated from. That leads me to want to keep that data separately.

I like the approach with listening for events and putting them on a queue (#2). You can then process this queue asynchronously and add the event data to the appropriate user profile's rotating backlog of activity items. Usually it doesn't matter if the activity feed update lags a bit after the actual changes.

An option is to keep the activity data in the same Neo4j instance as the data (#3), but it usually requires extra processing and uses resources that could have been used to serve actual data from the graph (serving content fast might be more important than having accurate activity feeds). Typically, the use of activity data is separate from the use of the actual data. So your queue processor could store it in a separate Neo4j store instead, where you can do whatever processing you need, e.g. aggregation so that you get "a, b, and c like this" rather than three separate "likes" updates.

Now, as I said initially, it all depends on your architecture and needs. If you're Facebook, you might want to think a bit more about this :) What is your scope? Single server?

I'm afraid I didn't understand your option #1. To answer your question about TransactionData, it's for one transaction. Check the javadocs for more information.

David

On Fri, Aug 27, 2010 at 9:14 AM, Joseph George josephgeor...@gmail.com wrote:

Hi, I am working on a social networking project and we are trying to enable Activity Streaming [similar to Facebook wall posts/streaming or LinkedIn network updates]. Wanted to check what the correct option is. Currently we see 3 options.

Option 1: Enable activity log - This provides a master list of all activities, aka events.
seen a file called active_tx_log under the database directory: is there any way we can enable this log so that we can reuse it as the Event Master described above? Question: Currently it is empty. How do we enable it so that we can reuse it as the Event Master?

Option 2: Write our own implementation of TransactionEventHandler and write to an event master as follows:

1. Register the TransactionEventHandler (registerTransactionEventHandler):

    SamTxEventHandler sambaashTxEventHandler = new SamTxEventHandler();
    graphDatabase.registerTransactionEventHandler( sambaashTxEventHandler );

2. Implement the custom event handler:

    public class SamTxEventHandler implements TransactionEventHandler<Object>

3. Monitor for property changes or node creation/update and push them to an external Event Master:

    public void afterCommit( TransactionData data, Object state ) {
        for ( PropertyEntry<Node> entry : data.assignedNodeProperties() ) {
            String key = entry.key();
            Object value = entry.previouslyCommitedValue();
            // write to an external Event Master file or MySQL table with the activity info,
            // then run a scheduler that picks it up, i.e. pubsubhub..
        }
    }

Question: TransactionData is valid only for the current transaction. Will it persist history for all transactions?

Option 3: While searching we came across https://svn.neo4j.org/laboratory/users/mattias/activity-stream/src/main/java/org/neo4j/examples/activitystream/ActivityStreamExample.java : This example takes the approach of creating additional event nodes and pushing them to a list of events.

We would appreciate it if you could suggest the right approach for us to proceed.

thanks and regards Joseph

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
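The queue-based approach recommended in the reply (enqueue events at commit time, process them asynchronously into a per-user rotating backlog) might be sketched like this. All names (`ActivityFeed`, `publish`, `drain`, `latest`) are hypothetical and none of this is Neo4j API; the in-memory map merely stands in for whatever separate store holds the feed data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; the backlog map stands in for a separate activity store.
final class ActivityFeed {
    private static final class Event {
        final String userId;
        final String text;

        Event(String userId, String text) {
            this.userId = userId;
            this.text = text;
        }
    }

    private final BlockingQueue<Event> pending = new LinkedBlockingQueue<>();
    private final Map<String, Deque<String>> backlogs = new HashMap<>();
    private final int backlogSize;

    ActivityFeed(int backlogSize) {
        this.backlogSize = backlogSize;
    }

    // Commit path (e.g. called from an afterCommit hook): only enqueue, no processing.
    void publish(String userId, String text) {
        pending.add(new Event(userId, text));
    }

    // Background worker: move queued events into each user's rotating backlog.
    synchronized int drain() {
        List<Event> batch = new ArrayList<>();
        pending.drainTo(batch);
        for (Event event : batch) {
            Deque<String> backlog =
                    backlogs.computeIfAbsent(event.userId, id -> new ArrayDeque<>());
            backlog.addFirst(event.text);
            if (backlog.size() > backlogSize) {
                backlog.removeLast(); // rotate: drop the oldest item
            }
        }
        return batch.size();
    }

    // Newest first; stays empty for a user until a drain has processed their events.
    synchronized List<String> latest(String userId) {
        Deque<String> backlog = backlogs.get(userId);
        return backlog == null ? new ArrayList<>() : new ArrayList<>(backlog);
    }

    public static void main(String[] args) {
        ActivityFeed feed = new ActivityFeed(2);
        feed.publish("alice", "liked a");
        feed.publish("alice", "liked b");
        feed.publish("alice", "liked c");
        feed.drain();
        System.out.println(feed.latest("alice")); // [liked c, liked b]
    }
}
```

Because `publish` only enqueues, the feed can lag behind the real data, which the reply argues is usually acceptable; aggregation of similar events could be added inside `drain`.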
Re: [Neo4j] java.io.IOException when starting REST server on Ubuntu 10.04 32 bit
Hi Todd,

We would really appreciate it if you could file a ticket on https://trac.neo4j.org with all the info you can provide. Also, if you have the time, you are definitely encouraged to have a look at the source and submit a suggested patch. (see http://wiki.neo4j.org/content/About_Contributor_License_Agreement for more info) We are super busy right now, but we have not lost track of this. My suggestion is to take it as far as you can on your own first, put all the info in the ticket, and then we can look at it. Thank you.

David

On Tue, Aug 24, 2010 at 11:08 AM, Todd Chaffee t...@mikamai.com wrote:

Any chance of getting some pointers on how to deal with this?

On Mon, Aug 23, 2010 at 11:52 AM, Todd Chaffee t...@mikamai.com wrote:

I'm getting an error when starting up the REST server on an Ubuntu 10.04 32bit box.

Output of uname -a:

    Linux ubuntu-server-base-v01 2.6.32-24-generic #39-Ubuntu SMP Wed Jul 28 06:07:29 UTC 2010 i686 GNU/Linux

I'm using the maven start script to run the REST server and here's the error I get:

    java.io.IOException: Invalid argument
        at sun.nio.ch.FileChannelImpl.map0(Native Method)
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:789)
        at org.neo4j.kernel.impl.transaction.xaframework.MemoryMappedLogBuffer.getNewMappedBuffer(MemoryMappedLogBuffer.java:77)
        at org.neo4j.kernel.impl.transaction.xaframework.MemoryMappedLogBuffer.<init>(MemoryMappedLogBuffer.java:46)
        at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.open(XaLogicalLog.java:238)
        at org.neo4j.kernel.impl.transaction.xaframework.XaContainer.openLogicalLog(XaContainer.java:90)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.<init>(NeoStoreXaDataSource.java:131)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
        at org.neo4j.kernel.impl.transaction.XaDataSourceManager.create(XaDataSourceManager.java:72)
        at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:147)
        at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:134)
        at org.neo4j.kernel.EmbeddedGraphDbImpl.<init>(EmbeddedGraphDbImpl.java:98)
        at org.neo4j.kernel.EmbeddedGraphDatabase.<init>(EmbeddedGraphDatabase.java:79)
        at org.neo4j.rest.domain.DatabaseLocator.getGraphDatabase(DatabaseLocator.java:31)
        at org.neo4j.rest.domain.DatabaseLocator.getConfiguration(DatabaseLocator.java:44)
        at org.neo4j.rest.GrizzlyBasedWebServer.<init>(GrizzlyBasedWebServer.java:26)
        at org.neo4j.rest.GrizzlyBasedWebServer.<clinit>(GrizzlyBasedWebServer.java:17)
        at org.neo4j.rest.WebServerFactory.getDefaultWebServer(WebServerFactory.java:9)
        at org.neo4j.rest.Main.main(Main.java:16)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:291)
        at java.lang.Thread.run(Thread.java:636)

According to Sun/Oracle, the transfer length is too large for the OS. http://forums.sun.com/thread.jspa?threadID=5205184

For reference, it looks like the sizes are declared in GraphDbInstance on line 62, method getDefaultParms. Is there any way I can override these sizes from the command line when starting up the REST server or does this need to be changed in the source code?

Todd

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] REST put over writing old properties
Hi Suhail, Could you explain the REST operation you're doing, what results you would expect from that operation, and what actually happens? David On Sun, Aug 22, 2010 at 9:11 AM, Suhail Ahmed suhail...@gmail.com wrote: Hi, i have been trying out the Neo4j REST interface and I found that PUT operation was replacing the existing properties of a node with a new one. This was happening on single values as well as multiple values. Is this a bug or am I doing something wrong here. I am using the REST plugin with Firefox. Cheers su./hail ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Node loading question
Rick,

After some further investigation Johan informed me that the property loading is actually smarter than I first told you. When you first access a node's or relationship's properties, all properties that fit into 8 bytes will have their data loaded immediately, i.e. booleans, ints, longs, etc. String and array data is however only loaded on demand for each property.

Example: Node with properties:

    int count = 4
    String name = "foo"
    float ratio = 1.2
    int[] history = { 4, 3, 2, 1 }

    node.getProperty( countKey );   // all properties are loaded, but only data for count and ratio
    node.getProperty( nameKey );    // now name has been loaded and cached
    node.getProperty( historyKey ); // now history has been loaded and cached

Hope this clears it up.

David

On Mon, Aug 16, 2010 at 12:30 AM, David Montag david.mon...@neotechnology.com wrote:

Hi Rick, I believe that once an operation touches properties for the first time for a node or relationship, all properties are loaded from disk. They are then cached in memory. What are you storing in the properties? Not sure I understand the optimization you're trying to make - maybe you could explain it a bit more? David

On Sun, Aug 15, 2010 at 6:29 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote:

When a node is accessed, are all of its properties loaded or are they lazily loaded as needed? I'm trying to decide whether to include a subset of (but duplicated) properties on relationships to avoid loading the entire node if that is a concern. Thanks, Rick

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
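The loading behaviour described above can be pictured with a toy model. To be clear, this is not how Neo4j implements it: `LazyPropertyHolder` is a hypothetical class, the `store` map stands in for the on-disk records, and "String or array" is used as a simple stand-in for "does not fit into 8 bytes".

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of eager small values and on-demand strings/arrays; NOT Neo4j code.
final class LazyPropertyHolder {
    private final Map<String, Object> store;                   // stands in for disk
    private final Map<String, Object> cache = new HashMap<>(); // what is in memory
    private boolean smallValuesLoaded = false;
    private int onDemandLoads = 0;

    LazyPropertyHolder(Map<String, Object> store) {
        this.store = new LinkedHashMap<>(store);
    }

    Object getProperty(String key) {
        if (!smallValuesLoaded) {
            // First access: pull in every value that is small enough to load eagerly.
            for (Map.Entry<String, Object> entry : store.entrySet()) {
                if (!isLoadedOnDemand(entry.getValue())) {
                    cache.put(entry.getKey(), entry.getValue());
                }
            }
            smallValuesLoaded = true;
        }
        if (!cache.containsKey(key) && store.containsKey(key)) {
            // String or array data: fetched only when this property is asked for.
            cache.put(key, store.get(key));
            onDemandLoads++;
        }
        return cache.get(key);
    }

    static boolean isLoadedOnDemand(Object value) {
        return value instanceof String || value.getClass().isArray();
    }

    boolean isCached(String key) {
        return cache.containsKey(key);
    }

    int onDemandLoads() {
        return onDemandLoads;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("count", 4);
        props.put("name", "foo");
        props.put("ratio", 1.2f);
        props.put("history", new int[] {4, 3, 2, 1});

        LazyPropertyHolder node = new LazyPropertyHolder(props);
        node.getProperty("count");
        System.out.println(node.isCached("ratio") + " " + node.isCached("name")); // true false
        node.getProperty("name");
        System.out.println(node.isCached("name")); // true
    }
}
```

Under this model, duplicating a few small properties onto relationships would save little, since small values arrive together on first access anyway; only string and array data costs an extra load.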
Re: [Neo4j] Neo4j, third party JTA Transaction M anager integration and potential bug.
Hi Francesco, If you want, you can send the patch directly to me, and we'll have a look at it this week. Thanks, David On Sat, Aug 14, 2010 at 1:49 PM, Degrassi Francesco francesco.degra...@emaze.net wrote: Hello everybody, We're evaluating Neo4j for integration in two products we are developing. One of the aspects we are interested in is making an embedded neo4j instance participate in JTA/XA transactions with a postgresql datasource and possibly JMS. I'm sorry if this is a bit long but the matter is complex. From what i understood after a bit of research, currently neo4j does not allow to use an external TransactionManager out of the box. I then dug into the source code and simply patched the following classes to allow me to pass an external JTA TransactionManager instance down to the TxModule, replacing Neo4j own TxManager. For our tests we used the following: * Neo4J 1.1 * Atomikos TransactionEssentials JTA Manager 3.6.5 * postgresql XA-enabled JDBC driver wrapped in an AtomikosDataSourceBean We created a simple test app which creates a very minimal Spring ApplicationContext which sets up the neo4j database, the JTA manager, the JDBC datasource and a sample transactional service which interacts with both the relational database and the graph database; we used Spring tx:annotation-driven infrastructure to handle begin/commit/rollback of the global transactions. 
Upon starting it, we got the following exception:

    org.neo4j.kernel.impl.transaction.LockNotFoundException: No transaction lock element found for Placebo tx for thread Thread[net.emaze.springexperiment.App.main(),5,net.emaze.springexperiment.App]
        at org.neo4j.kernel.impl.transaction.RWLock.releaseWriteLock(RWLock.java:345)
        at org.neo4j.kernel.impl.transaction.LockManager.releaseWriteLock(LockManager.java:202)
        at org.neo4j.kernel.impl.core.LockReleaser.releaseLocks(LockReleaser.java:337)
        at org.neo4j.kernel.impl.core.LockReleaser$ReadOnlyTxReleaser.afterCompletion(LockReleaser.java:713)
        at com.atomikos.icatch.jta.Sync2Sync.afterCompletion(Sync2Sync.java:91)
        at com.atomikos.icatch.imp.SynchToFSM.doAfterCompletion(SynchToFSM.java:38)
        at com.atomikos.icatch.imp.SynchToFSM.entered(SynchToFSM.java:59)
        at com.atomikos.finitestates.FSMImp.notifyListeners(FSMImp.java:197)
        at com.atomikos.finitestates.FSMImp.setState(FSMImp.java:288)
        at com.atomikos.icatch.imp.CoordinatorImp.setState(CoordinatorImp.java:498)
        at com.atomikos.icatch.imp.CoordinatorImp.setStateHandler(CoordinatorImp.java:328)
        at com.atomikos.icatch.imp.CoordinatorStateHandler.commit(CoordinatorStateHandler.java:730)
        at com.atomikos.icatch.imp.IndoubtStateHandler.commit(IndoubtStateHandler.java:225)
        at com.atomikos.icatch.imp.CoordinatorImp.commit(CoordinatorImp.java:828)
        at com.atomikos.icatch.imp.CoordinatorImp.terminate(CoordinatorImp.java:1127)
        at com.atomikos.icatch.imp.CompositeTerminatorImp.commit(CompositeTerminatorImp.java:151)
        at com.atomikos.icatch.jta.TransactionImp.commit(TransactionImp.java:298)
        at com.atomikos.icatch.jta.TransactionManagerImp.commit(TransactionManagerImp.java:612)
        at com.atomikos.icatch.jta.UserTransactionImp.commit(UserTransactionImp.java:168)
        at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:1028)
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:732)
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:701)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:321)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:116)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at $Proxy9.success(Unknown Source)
        at net.emaze.springexperiment.App.main(App.java:21)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:291)
        at java.lang.Thread.run(Thread.java:619)

Further analysis resulted in the following hypothesis.

1. The external JTA TransactionManager correctly completes the transaction.
2. The TransactionManager then calls the Synchronization.afterCompletion() method on the LockReleaser$ReadOnlyTxReleaser for the (now finished) transaction.
3. This causes LockReleaser.releaseLocks() to be called, which through various
Re: [Neo4j] neo4j REST server configuration
11, 2010 at 10:55 PM, David Montag david.mon...@neotechnology.com wrote: Hi Brock, Sorry, I misread your e-mail, I thought you said compile time. I should at least have breakfast before answering any e-mails :) So, a runtime error. What library/class is missing? Could you provide us with the error, it would help. You can grab Jetty 6.1.25 and put it in lib, if they're not there. But they should be, if everything was installed correctly. mvn clean install in the REST component, and mvn clean package in the standalone component should do it. Please keep us updated on your progress. David On Thu, Aug 12, 2010 at 7:40 AM, David Montag david.mon...@neotechnology.com wrote: Hi Brock, Ok, that should have been taken care of by Maven, let me have a look at that. It should of course work to just mvn install:install-file them yourself into your repository. But I'll have a look at that. I'm free for gchat any time today if you want. David On Thu, Aug 12, 2010 at 12:29 AM, Brock Rousseau bro...@gmail.com wrote: Hey David, No worries about the disclaimer. I am getting a runtime error on startup though due to the lack of the Jetty libraries. Any special instructions there or should I just grab them from Jetty's website? Also, would any of you be available via gchat some time in the next 24 hours so I can relay the results of load testing? I can adjust my schedule since you guys are CEST if I'm not mistaken, just let me know. Thanks, Brock On Wed, Aug 11, 2010 at 2:46 PM, David Montag david.mon...@neotechnology.com wrote: Hi Brock, If you svn update to the latest version of the REST component, apply the patch I'll send to you, and rebuild it as per Jacob's previous instructions, then it should use Jetty instead. Keep in mind that this was a quick fix done today, so it might break down for the same or other reasons, especially as we haven't been able to reproduce the error you're seeing, and hence test that it actually fixes anything. Just a disclaimer. 
David On Wed, Aug 11, 2010 at 7:30 PM, Brock Rousseau bro...@gmail.com wrote: Hi Jacob, Would you be able to email me that patch? It's probably easier for me to throw it on our server and let you know how it goes rather than you guys having to try and reproduce it. Rough data for our server: ~1.5 billion relationships ~400 million nodes ~1,200 transactions per minute ~90% are lookups, 10% inserts Not sure if you're still around due to the time difference, but if you could provide that patch today I can test it right away. Thanks, Brock On Wed, Aug 11, 2010 at 9:22 AM, Jacob Hansson ja...@voltvoodoo.com wrote: So the current status is that David has got neo4j REST running on Jetty with all tests passing. We've also searched through the code, and found that there are no interrupt() calls in the jersey source, while there are a few on the grizzly side. There is one in particular that we have been looking at, related to keep-alive timeouts, that may be the culprit. If that was the problem, we've got a fix for it. We have, however, been unable to recreate the problem so far, so we can't tell if we've solved it or not :) Brock: could you give us an idea of what types of requests you were throwing at the server, and a rough estimate of how many? /Jacob On Wed, Aug 11, 2010 at 2:35 PM, Jacob Hansson ja...@voltvoodoo.com wrote: Hi all! Johan took a look at the stack trace, and explained the problem. What happens is that something, either the Grizzly server or the jersey wrapper calls Thread.interrupt() on one of the neo4j threads (which should be considered a bug in whichever one of them does that). This triggers an IOError deep down in neo4j, which in turn causes the rest of the problems. I'm working on recreating the situation, and David is working on switching the REST system over to run on Jetty instead of Grizzly. We'll keep you posted on the progress. 
/Jacob On Wed, Aug 11, 2010 at 1:51 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Nice, will try that out Jim! Grinder seems cool. Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer
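For background on why a stray Thread.interrupt() can surface as an I/O failure: Java NIO file channels are interruptible, so a read attempted by a thread whose interrupt flag is set closes the channel and fails with ClosedByInterruptException. The sketch below demonstrates that JDK behaviour in isolation. It is an illustration of the general mechanism only; the thread does not confirm which exact code path failed inside Neo4j, and `InterruptedChannelDemo` is a made-up name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrates JDK behaviour only; not taken from the Neo4j or Grizzly sources.
final class InterruptedChannelDemo {
    private InterruptedChannelDemo() {}

    // Returns a short description of what happened to the channel.
    static String readWhileInterrupted() {
        try {
            Path file = Files.createTempFile("interrupt-demo", ".bin");
            try {
                Files.write(file, new byte[] {1, 2, 3, 4});
                try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                    // Simulate some other component interrupting this thread.
                    Thread.currentThread().interrupt();
                    try {
                        channel.read(ByteBuffer.allocate(4));
                        return "read succeeded";
                    } catch (ClosedByInterruptException e) {
                        // The channel is now closed for every user, not just this read.
                        return "channel closed by interrupt, open=" + channel.isOpen();
                    }
                }
            } finally {
                Thread.interrupted(); // clear the flag so the caller is unaffected
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWhileInterrupted());
    }
}
```

The important consequence is that the channel stays closed afterwards, so one interrupt can break every later operation that shares it, which would be consistent with a server that starts answering all requests with errors.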
Re: [Neo4j] neo4j REST server configuration
Hi Brock, If you svn update to the latest version of the REST component, apply the patch I'll send to you, and rebuild it as per Jacob's previous instructions, then it should use Jetty instead. Keep in mind that this was a quick fix done today, so it might break down for the same or other reasons, especially as we haven't been able to reproduce the error you're seeing, and hence test that it actually fixes anything. Just a disclaimer. David On Wed, Aug 11, 2010 at 7:30 PM, Brock Rousseau bro...@gmail.com wrote: Hi Jacob, Would you be able to email me that patch? It's probably easier for me to throw it on our server and let you know how it goes rather than you guys having to try and reproduce it. Rough data for our server: ~1.5 billion relationships ~400 million nodes ~1,200 transactions per minute ~90% are lookups, 10% inserts Not sure if you're still around due to the time difference, but if you could provide that patch today I can test it right away. Thanks, Brock On Wed, Aug 11, 2010 at 9:22 AM, Jacob Hansson ja...@voltvoodoo.com wrote: So the current status is that David has got neo4j REST running on Jetty with all tests passing. We've also searched through the code, and found that there are no interrupt() calls in the jersey source, while there are a few on the grizzly side. There is one in particular that we have been looking at, related to keep-alive timeouts, that may be the culprit. If that was the problem, we've got a fix for it. We have, however, been unable to recreate the problem so far, so we can't tell if we've solved it or not :) Brock: could you give us an idea of what types of requests you were throwing at the server, and a rough estimate of how many? /Jacob On Wed, Aug 11, 2010 at 2:35 PM, Jacob Hansson ja...@voltvoodoo.com wrote: Hi all! Johan took a look at the stack trace, and explained the problem. 
What happens is that something, either the Grizzly server or the jersey wrapper calls Thread.interrupt() on one of the neo4j threads (which should be considered a bug in whichever one of them does that). This triggers an IOError deep down in neo4j, which in turn causes the rest of the problems. I'm working on recreating the situation, and David is working on switching the REST system over to run on Jetty instead of Grizzly. We'll keep you posted on the progress. /Jacob On Wed, Aug 11, 2010 at 1:51 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Nice, will try that out Jim! Grinder seems cool. Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Wed, Aug 11, 2010 at 12:52 PM, Jim Webber j...@webber.name wrote: Perhaps something as simple as a Grinder script might help? Jim On 11 Aug 2010, at 17:57, Brock Rousseau wrote: Thanks Peter. Let us know if there is anything else we can provide in the way of logs or diagnosis from our server. -Brock On Tue, Aug 10, 2010 at 11:51 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Mmh, seems we should stresstest the server and Grizzly with e.g. http://www.soapui.org and see if we can reproduce the scenario, if there is no obvious hint to this. Will try to set it up ... Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. 
On Wed, Aug 11, 2010 at 4:14 AM, Brock Rousseau bro...@gmail.com wrote:

The patch worked perfectly for increasing the concurrent transaction cap, but unfortunately exposed another issue. After increasing the load hitting our REST server, it performs smoothly for 10-15 minutes, then begins issuing 500 responses on all transactions. When it happens, the number of open transactions freezes in JMX and the heap size essentially remains static. Below are the two stack traces we see in the wrapper.log.

Here is what I think are the relevant configuration lines:

wrapper.conf:

    wrapper.java.additional.1=-d64
    wrapper.java.additional.2=-server
    wrapper.java.additional.4=-Xmx8192m
    wrapper.java.additional.3=-XX:+UseConcMarkSweepGC
Re: [Neo4j] neo4j REST server configuration
Hi Brock, Ok, that should have been taken care of by Maven, let me have a look at that. It should of course work to just mvn install:install-file them yourself into your repository. But I'll have a look at that. I'm free for gchat any time today if you want. David On Thu, Aug 12, 2010 at 12:29 AM, Brock Rousseau bro...@gmail.com wrote: Hey David, No worries about the disclaimer. I am getting a runtime error on startup though due to the lack of the Jetty libraries. Any special instructions there or should I just grab them from Jetty's website? Also, would any of you be available via gchat some time in the next 24 hours so I can relay the results of load testing? I can adjust my schedule since you guys are CEST if I'm not mistaken, just let me know. Thanks, Brock On Wed, Aug 11, 2010 at 2:46 PM, David Montag david.mon...@neotechnology.com wrote: Hi Brock, If you svn update to the latest version of the REST component, apply the patch I'll send to you, and rebuild it as per Jacob's previous instructions, then it should use Jetty instead. Keep in mind that this was a quick fix done today, so it might break down for the same or other reasons, especially as we haven't been able to reproduce the error you're seeing, and hence test that it actually fixes anything. Just a disclaimer. David On Wed, Aug 11, 2010 at 7:30 PM, Brock Rousseau bro...@gmail.com wrote: Hi Jacob, Would you be able to email me that patch? It's probably easier for me to throw it on our server and let you know how it goes rather than you guys having to try and reproduce it. Rough data for our server: ~1.5 billion relationships ~400 million nodes ~1,200 transactions per minute ~90% are lookups, 10% inserts Not sure if you're still around due to the time difference, but if you could provide that patch today I can test it right away. 
Thanks, Brock On Wed, Aug 11, 2010 at 9:22 AM, Jacob Hansson ja...@voltvoodoo.com wrote: So the current status is that David has got neo4j REST running on Jetty with all tests passing. We've also searched through the code, and found that there are no interrupt() calls in the jersey source, while there are a few on the grizzly side. There is one in particular that we have been looking at, related to keep-alive timeouts, that may be the culprit. If that was the problem, we've got a fix for it. We have, however, been unable to recreate the problem so far, so we can't tell if we've solved it or not :) Brock: could you give us an idea of what types of requests you were throwing at the server, and a rough estimate of how many? /Jacob On Wed, Aug 11, 2010 at 2:35 PM, Jacob Hansson ja...@voltvoodoo.com wrote: Hi all! Johan took a look at the stack trace, and explained the problem. What happens is that something, either the Grizzly server or the jersey wrapper calls Thread.interrupt() on one of the neo4j threads (which should be considered a bug in whichever one of them does that). This triggers an IOError deep down in neo4j, which in turn causes the rest of the problems. I'm working on recreating the situation, and David is working on switching the REST system over to run on Jetty instead of Grizzly. We'll keep you posted on the progress. /Jacob On Wed, Aug 11, 2010 at 1:51 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Nice, will try that out Jim! Grinder seems cool. Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Wed, Aug 11, 2010 at 12:52 PM, Jim Webber j...@webber.name wrote: Perhaps something as simple as a Grinder script might help? 
Jim On 11 Aug 2010, at 17:57, Brock Rousseau wrote: Thanks Peter. Let us know if there is anything else we can provide in the way of logs or diagnosis from our server. -Brock On Tue, Aug 10, 2010 at 11:51 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Mmh, seems we should stresstest the server and Grizzly with e.g. http://www.soapui.org and see if we can reproduce the scenario, if there is no obvious hint to this. Will try to set it up ... Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype
Re: [Neo4j] neo4j REST server configuration
Hi Brock, Sorry, I misread your e-mail, I thought you said compile time. I should at least have breakfast before answering any e-mails :) So, a runtime error. What library/class is missing? Could you provide us with the error, it would help. You can grab Jetty 6.1.25 and put it in lib, if they're not there. But they should be, if everything was installed correctly. mvn clean install in the REST component, and mvn clean package in the standalone component should do it. Please keep us updated on your progress. David On Thu, Aug 12, 2010 at 7:40 AM, David Montag david.mon...@neotechnology.com wrote: Hi Brock, Ok, that should have been taken care of by Maven, let me have a look at that. It should of course work to just mvn install:install-file them yourself into your repository. But I'll have a look at that. I'm free for gchat any time today if you want. David On Thu, Aug 12, 2010 at 12:29 AM, Brock Rousseau bro...@gmail.com wrote: Hey David, No worries about the disclaimer. I am getting a runtime error on startup though due to the lack of the Jetty libraries. Any special instructions there or should I just grab them from Jetty's website? Also, would any of you be available via gchat some time in the next 24 hours so I can relay the results of load testing? I can adjust my schedule since you guys are CEST if I'm not mistaken, just let me know. Thanks, Brock On Wed, Aug 11, 2010 at 2:46 PM, David Montag david.mon...@neotechnology.com wrote: Hi Brock, If you svn update to the latest version of the REST component, apply the patch I'll send to you, and rebuild it as per Jacob's previous instructions, then it should use Jetty instead. Keep in mind that this was a quick fix done today, so it might break down for the same or other reasons, especially as we haven't been able to reproduce the error you're seeing, and hence test that it actually fixes anything. Just a disclaimer. 
David On Wed, Aug 11, 2010 at 7:30 PM, Brock Rousseau bro...@gmail.com wrote: Hi Jacob, Would you be able to email me that patch? It's probably easier for me to throw it on our server and let you know how it goes rather than you guys having to try and reproduce it. Rough data for our server: ~1.5 billion relationships ~400 million nodes ~1,200 transactions per minute ~90% are lookups, 10% inserts Not sure if you're still around due to the time difference, but if you could provide that patch today I can test it right away. Thanks, Brock On Wed, Aug 11, 2010 at 9:22 AM, Jacob Hansson ja...@voltvoodoo.com wrote: So the current status is that David has got neo4j REST running on Jetty with all tests passing. We've also searched through the code, and found that there are no interrupt() calls in the jersey source, while there are a few on the grizzly side. There is one in particular that we have been looking at, related to keep-alive timeouts, that may be the culprit. If that was the problem, we've got a fix for it. We have, however, been unable to recreate the problem so far, so we can't tell if we've solved it or not :) Brock: could you give us an idea of what types of requests you were throwing at the server, and a rough estimate of how many? /Jacob On Wed, Aug 11, 2010 at 2:35 PM, Jacob Hansson ja...@voltvoodoo.com wrote: Hi all! Johan took a look at the stack trace, and explained the problem. What happens is that something, either the Grizzly server or the jersey wrapper calls Thread.interrupt() on one of the neo4j threads (which should be considered a bug in whichever one of them does that). This triggers an IOError deep down in neo4j, which in turn causes the rest of the problems. I'm working on recreating the situation, and David is working on switching the REST system over to run on Jetty instead of Grizzly. We'll keep you posted on the progress. 
/Jacob On Wed, Aug 11, 2010 at 1:51 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Nice, will try that out Jim! Grinder seems cool. Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Wed, Aug 11, 2010 at 12:52 PM, Jim Webber j...@webber.name wrote: Perhaps something as simple as a Grinder script might help? Jim On 11 Aug 2010, at 17:57, Brock Rousseau wrote
Re: [Neo4j] Querying for nodes that have no relationship to a specific node
Alberto, Hope your testing is coming along well. Feel free to post your progress to the list! David On Wed, Jul 28, 2010 at 5:02 PM, Alberto Perdomo alberto.perd...@gmail.com wrote: Hi David, But then you need to store the result. You can store these metrics as relationships in neo4j, and then just update them for each user when you recompute. You can find the user nodes via indexing. Maybe it's acceptable that some metrics are out of date, so you can just background process them continuously. I already have background processes that go through all users and calculate new pairs. But in order to do that I need to exclude the pairs I already have, because otherwise it would be silly: as the relationship density grows, the probability of calculating a pair again would get higher and higher. Would I be able to do that kind of query using indexing? Depending on your scenario, if your users know each other, it might be interesting to start computing in a foaf style order (breadth first). Remember, the power is in the relationships. Isolated nodes are not interesting. You mean I look first for possible pairs with users that are friends of friends instead of randomly? We are also interested in storing friendship relationships, so that sounds interesting. That would be a different type of query: Traverse the graph from node A to nodes which are friends of friends of A and have no match relationship with A. I guess that is not difficult to implement using Neo4j? Thanks for your input David! ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!
Hi Jeff, If I'm not mistaken, Neo4j loads all properties for a node or relationship when you invoke any operation that touches a property. As for the performance of traversals, it is highly dependent on how deep you traverse, and what you do during the traversal, so ymmv. Using a traverser is slower than doing getRelationships, as the traverser does extra processing to keep state around. Are you using Node#traverse() or the new traversal framework? Is your code available somewhere? Are you saying that checking whether there's a relationship between A and B takes over 20s? How many relationships do A and B have? What does your neo config look like (params)? Edge indexing might be a solution, you can look at the new indexing component for that. ( https://svn.neo4j.org/laboratory/components/lucene-index/) As for the incrementing of a property - while you're within a transaction, couldn't you increment a variable and then write that variable at the end of the transaction? David On Fri, Jul 30, 2010 at 8:10 PM, Jeff Klann jkl...@iupui.edu wrote: Hi, so I got 2GB more RAM and noticed that after adding some more memory map and increasing the heap space, my small query went from 6hrs to 3min. Quite reasonable! But the larger one that would take a month would still take a month. So I've been performance testing parts of it: The algorithm as in my first post showed *no* performance improvement on more RAM. But individual parts - Traversing only (first three lines) was much speedier, but still seems slow. 1.5 million traversals (15 out of 7000 items) took 23sec. It shaves off a few seconds if I run this twice and time it the second time, or if I don't print any node properties as I traverse. (Does Neo4J load ALL the properties for a node if one is accessed?) Even with a double run and not reading node properties, it still takes 16sec, which would make traversal take two hours. I thought Neo4J was supposed to do ~1m traversals/sec, this is doing about 100k. Why? 
(And in fact on the other query it was getting about 800,000 traversals/sec.) Is one of Traversers vs. getRelationship iterators faster when getting all relationships of a type at depth 1? - Searching for relationships between A and B (but not writing to them) takes it from 20s to 91s. Yuck. Maybe edge indexing is the way to avoid that? - Incrementing a property on the root node for every A and B takes it from 20s to 61s (57s if it's all in one transaction). THAT seems weird. I imagine it has something to do with logging changes? Any way that can be turned off for a particular property (like it could be marked 'volatile' during a transaction or something)? I'm much more hopeful with the extra RAM but it's still kind of slow. Suggestions? Thanks, Jeff Klann On Wed, Jul 28, 2010 at 11:20 AM, Jeff Klann jkl...@iupui.edu wrote: Hi, I have an algorithm running on my little server that is very very slow. It's a recommendation traversal (for all A and B in the catalog of items: for each item A, how many customers also purchased another item in the catalog B). It's processed 90 items in about 8 hours so far! Before I dive deeper into trying to figure out the performance problem, I thought I'd email the list to see if more experienced people have ideas. Some characteristics of my datastore: its size is pretty moderate for a database application. 7500 items, not sure how many customers and purchases (how can I find the size of an index?) but probably ~1 million customers. The relationshipstore + nodestore 500mb. (Propertystore is huge but I don't access it much in traversals.) The possibilities I see are: 1) *Neo4J is just slow.* Probably not slower than Postgres which I was using previously, but maybe I need to switch to a distributed map-reduce db in the cloud and give up the very nice graph modeling approach? I didn't think this would be a problem, because my data size is pretty moderate and Neo4J is supposed to be fast. 
2) *I just need more RAM.* I definitely need more RAM - I have a measly 1GB currently. But would this get my 20day traversal down to a few hours? Doesn't seem like it'd have THAT much impact. I'm running Linux and nothing much else besides Neo4j, so I've got 650m physical RAM. Using 300m heap, about 300m memory-map. 3) *There's some secret about Neo4J performance I don't know.* Is there something I'm unaware that Neo4J is doing? When I access a property, does it load a chunk of properties I don't care about? For the current node/edge or others? I turned off log rotation and I commit after each item A. Are there other performance tips I might have missed? 4) *My algorithm is inefficient.* It's a fairly naive algorithm and maybe there's some optimizations I can do. It looks like: For each item A in the catalog: For each customer C that has purchased that item: For each item B that customer purchased:
Re: [Neo4j] Read-only transactions?
Hi Tim, It is not possible to mark a transaction as read-only. As Martin said, you don't need a transaction to perform read operations. If you have a transaction but don't do any writes, then it won't commit anything. So as long as you don't do any writes, you shouldn't experience any delays when finishing the transaction. David On Wed, Jul 28, 2010 at 8:38 PM, Martin Neumann m.neumann.1...@gmail.com wrote: If you use the latest development version of Neo4j you can do read operations without a transaction. Especially for a huge number of reads this speeds things up a lot. cheers Martin On Wed, Jul 28, 2010 at 4:53 PM, Tim Jones bogol...@ymail.com wrote: Hi, Is it possible to mark a transaction as being read-only? It's taking a while for my transaction to shut down, even though there are no writes to commit. Thanks, Tim ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!
Hi Jeff, Please see answers below. On Mon, Aug 2, 2010 at 5:47 PM, Jeff Klann jkl...@iupui.edu wrote: Thank you all for your continued interest in helping me. I tweaked the code more to minimize writes to the database and it now looks like: For each item A For each customer that purchased A For each item B (with idA) that A purchased Increment (in memory) the weight of (A-B) Write out the edges [(A-B):weight] to disk and clear the in-memory map This actually (if I'm not mistaken) covers all relationships and does 7500 items in about 45 minutes! Not too bad, especially due to (I think) avoiding edge-checking, and I think it's usable for my application, though it's still ~200k traversals/sec on average, which is a few times slower than I hoped. I doubt that's much faster than a two-table join in SQL, though deeper traversals should show benefits. Do you need to do this computation on the full graph all the time? Maybe it would be enough to do it once, and then update it when a customer buys something? Usually, high one-time costs can be tolerated, and with Neo4j you can actually do the updating for a customer performing a purchase at runtime without performance problems. - David, thank you for your answers on traversers vs. getRelationships and on property-loading. I imported some properties I don't really need, perhaps if I delete them it'll speed things up? Also I'm using the old Node.traverse(). How is the new framework better? I expect it has a nicer syntax, which I would like to try, but does it improve performance too? Well, depending on your setup you should be able to theoretically improve performance compared to the old traversal framework. The old framework keeps track of visited nodes, so that you don't traverse to the same node twice. This behavior is customizable in the new framework. Please see http://wiki.neo4j.org/content/Traversal_Framework and check the Uniqueness constraints. 
If you know exactly when to stop, then you should be able to use Uniqueness.NONE, meaning that the framework does not keep track of visited nodes, meaning that you could end up traversing in a cycle. In your network however, you might know that you always traverse (item) --BOUGHT-- (customer) --BOUGHT-- (item) --CORRELATION-- (item)* and no further than that, so then you know that you won't end up in a cycle. But yeah, then you need to programmatically make sure you don't go too far. And I don't know if this gives you any performance benefits whatsoever. Also, as I understand it, all properties for a node are loaded when they are first touched. Then they're kept in memory, so if you update properties later on the same node, and it is still cached, it won't reread everything. - David, on checking relationships, I said checking 15 nodes for relationships to n other nodes (where n might be large, I'm not sure how large, but 7500), takes 71s. The nodes are a highly-connected graph and also with edges going out to customers. So in the end the max edges for a node would be very high, up to around 7500 items and 300,000 customers. Just so I understand your data model: if a customer buys N products A1 - AN, will there be a complete graph between the nodes A1 - AN? When in your algorithm do you need to check for the occurrence of a relationship between A and B? - Martin, I'm confused a bit about SSDs. I read up on them after I read your post. You said flash drives are best, but I read that even the highest performing flash drives are about 30MB/s read, whereas modern hard drives are at least 50MB/s. True SSDs claim to be 50MB/s too but they're quite expensive. So why is a flash drive best? I could definitely spring for one big enough to hold my db if it'd help a lot, but it has that slower read speed. Does the faster seek time really make that much of a difference? Any brands you'd recommend? 
I think the general consensus is that an SSD is usually the single best upgrade you can get for a computer or server. The blazingly fast seeks make all the difference. If you have a big file with data spread out over it and you need to read and write to different locations of the file rapidly, that means a lot of work for the heads in a conventional hard drive. The SSD nails this. Know when you start an application or do something processing heavy, and you hear your hard drive work? It's seeking. As for brands, I've heard good things about the Intel X25 ones. I have an SSD in my mac, but I don't know what brand it is. All I know is that it's ridiculously fast. David I will post some code snippets. Looks like there are a lot of sites for sharing codes snippets. Any recommendation? Thanks all, Jeff Klann On Mon, Aug 2, 2010 at 8:44 AM, David Montag david.mon...@neotechnology.com wrote: Hi Jeff, If I'm not mistaken, Neo4j loads all properties for a node or relationship when you invoke any operation that touches a property
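The revised loop Jeff describes in this thread (accumulate the A-B weights in memory, then write them out once per item A) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the maps stand in for the graph, no Neo4j API is used, and the class and method names (CoPurchaseSketch, coPurchaseWeights) are hypothetical rather than taken from Jeff's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java stand-in for the item/customer graph; no Neo4j classes are used. */
public class CoPurchaseSketch {

    /**
     * For every item A, counts how often another item B was bought by a
     * customer who also bought A. Weights for one A are accumulated in an
     * in-memory map and stored once per A, mirroring the "write out the
     * edges and clear the map" step from the thread.
     */
    static Map<String, Map<String, Integer>> coPurchaseWeights(
            Map<String, List<String>> purchasesByCustomer) {
        // Invert customer -> items into item -> customers.
        Map<String, List<String>> customersByItem = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : purchasesByCustomer.entrySet()) {
            for (String item : e.getValue()) {
                customersByItem.computeIfAbsent(item, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        Map<String, Map<String, Integer>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : customersByItem.entrySet()) {
            String a = e.getKey();
            Map<String, Integer> weights = new HashMap<>(); // in-memory accumulator for A
            for (String customer : e.getValue()) {
                for (String b : purchasesByCustomer.get(customer)) {
                    if (!b.equals(a)) {
                        weights.merge(b, 1, Integer::sum);
                    }
                }
            }
            result.put(a, weights); // one "write" per item A instead of one per pair
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> purchases = new HashMap<>();
        purchases.put("c1", List.of("book", "pen"));
        purchases.put("c2", List.of("book", "pen", "lamp"));
        purchases.put("c3", List.of("lamp"));
        System.out.println(coPurchaseWeights(purchases));
    }
}
```

The point of the design is that no relationship lookup between A and B happens inside the inner loop; the only per-pair work is a hash map update, and the store is touched once per item A.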
[Neo4j] Querying for nodes that have no relationship to a specific node
Hi Alberto, Okay, interesting. You want to calculate some metric between pairs of users, so it's not a friend-of-a-friend scenario or anything like that, which would have been great in a graph db. This is just all/some pairs of random users. That you can do with your SQL db or neo4j or whatever db you want. But then you need to store the result. You can store these metrics as relationships in neo4j, and then just update them for each user when you recompute. You can find the user nodes via indexing. Maybe it's acceptable that some metrics are out of date, so you can just background process them continuously. Depending on your scenario, if your users know each other, it might be interesting to start computing in a foaf style order (breadth first). Remember, the power is in the relationships. Isolated nodes are not interesting. David -- Sent from cell, excuse typos. On Wednesday, July 28, 2010, Alberto Perdomo alberto.perd...@gmail.com wrote: Hi everyone, I would have an SQL db for the app besides the graph db. I have users that I would store as nodes within the graph besides storing them in SQL as well. Within those nodes I store attributes like male/female, age or date of birth, etc. I would have one kind of relationship for friendship, which doesn't present any kind of problem and I would do the standard type of queries neo4jr-social provides (e.g. friend suggestions, degrees of separation, friends in common, ...) We want to measure the compatibility/taste match/whatever between users in background, meaning for instance how much you have in common. This is done in Ruby. The result will be an integer between 0 and 100. BTW, this value is symmetric, meaning it could be modelled as a bidirectional relationship. Let's say I have 10k users and for every user I calculate the match between him and 10 other users. If I store all the results I calculate, I potentially create up to 100k relationships every day / 3m relationships every month. 
If I store this in SQL it can turn into a bottleneck very fast. The table will soon grow too big and the queries will get slower and slower. That's when I started thinking about storing those relationships in Neo4j, because it's meant to handle a very large number of nodes and relationships really efficiently. I can model that as a relationship and either store the value inside the relationship or encode it in the relationship names as 'match_high, match_medium, match_low'. Now back to step 1: selecting the users I'll be calculating new relationships with. They must match certain criteria, e.g. female/male, similar age, etc. and it could be pseudo random. Now the first step if you think in SQL is to query for all users that match the criteria and don't have a relationship with user A. And then yesterday, looking at the Neo4j docs, I thought this kind of query cannot be done. I could select all the users that match the criteria from SQL, then query all the relationships for A from Neo4j, subtract those from the array of valid users and pick n users randomly. Because n is a low value, perhaps 10, this looks to me like a very inefficient way of doing this. Also it will be fast at the beginning but it will get slower as the relationship density grows with time... Maybe I should consider a different strategy. I've also been considering storing only high or interesting values, but it would be more interesting to have the n top users for A ordered by relationship value. If I go ahead with this then I could just go and store it within SQL. This is not what we strive for, but if I don't find a better way I guess we'll have to live with that. Also the solution I find should be easily scalable. It should also apply when having for instance 100k users. Any thoughts or comments? What would you recommend? Thanks for the help guys! Alberto. 
___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Neo4j Multiple Nodes Issue
Hi Maaz, Rick is on the right track with the UID generation. You need to make more than the ID generation thread safe though. Your first code snippet is obviously not thread safe. The second one uses double checked locking, and should be ok. You can also simply synchronize around the whole first snippet, or try Johan's suggested locking strategy. I'd recommend you to stay away from Neo4j's node IDs in this case, due to the reasons Rick stated. Now, regarding performance, there are a lot of factors here. Will your code serve requests? Over HTTP? If so, does the locking here really matter? I.e. is it washed out by the orders of magnitude greater network I/O times? If you're really concerned about performance, then you *will* want to do some kind of profiling. Point is, does the locking here matter in relation to other delays? Knowing where you should put your time optimizing is key. As to the performance of neoIndexService.getSingleNode(), I'm afraid I currently don't know. Maybe some of the other guys can help you out with this. Regarding your question about batching operations together in a single transaction versus doing them in different transactions, you can easily try it by writing a test. But just thinking about it, each mutating transaction has to hit disk, so it might cost you some I/O seeking doing different transactions, so I would count on it taking longer. How much longer, I can't say. And someone please correct me if I'm wrong here! David On Wednesday, July 21, 2010, Maaz Bin Tariq maaz.ta...@yahoo.com wrote: Thanks Johan Svensson and Rick Bullotta. Yes Bullotta, you are are right the node creation is the problem , our code is something similar to following code. we donot want to synchronized the method as it cost some performance. Any suggestion to improve it. Also how costly neoIndexService.getSingleNode() method is if we call it twice/thrice and the node was not created. Will it search the whole graph? 
Svensson, in our case the problem is the creation of duplicate reference nodes, which is not even handled in the sample code.
---
private IndexService neoIndexService;
private GraphDatabaseService neoService;

private Node getOrCreateUserNodeByUserId(final Long id) {
    Node node = neoIndexService.getSingleNode(UID, id);
    if (node == null) {
        node = neoService.createNode();
        node.setProperty(UID, id);
        neoIndexService.index(node, UID, id);
    }
    return node;
}
---
How costly is the following solution?

private Node getOrCreateUserNodeByUserId(final Long id) {
    Node node = neoIndexService.getSingleNode(UID, id);
    if (node == null) {
        node = createUserNode(id);
    }
    return node;
}

private synchronized Node createUserNode(final Long id) {
    Node node = neoIndexService.getSingleNode(UID, id);
    if (node == null) {
        node = neoService.createNode();
        node.setProperty(UID, id);
        neoIndexService.index(node, UID, id);
    }
    return node;
}

Thanks -Maaz
--- On Wed, 7/21/10, Johan Svensson jo...@neotechnology.com wrote: From: Johan Svensson jo...@neotechnology.com Subject: Re: [Neo4j] Neo4j Multiple Nodes Issue To: Neo4j user discussions user@lists.neo4j.org Date: Wednesday, July 21, 2010, 7:38 PM Hi, One can use the built in locking in the kernel to synchronize and make code thread safe. Here is an example of this: https://svn.neo4j.org/examples/apoc-examples/trunk/src/main/java/org/neo4j/examples/socnet/PersonFactory.java The createPerson method guards against creation of multiple persons with the same name by creating a relationship from the reference node.
After the relationship has been created (in the transaction but not yet committed) the write lock for the reference node has been acquired making sure any other running transaction has to wait for the lock to be released. Finally the index is checked to make sure some other transaction did not create the person while the current transaction was waiting for the write lock. Even simpler is to just remove a non existing property from a node or relationship. That will grab a lock on the specific node or relationship (that will be held until the transaction commits or is rolledback). Regards, Johan On Wed, Jul 21, 2010 at 4:07 PM, Rick Bullotta
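The check, lock, re-check shape discussed in this thread can be tried out without a database. The sketch below is plain Java under loud assumptions: the ConcurrentHashMap and the counter are stand-ins for the index and for node creation, the names (GetOrCreateSketch, race) are hypothetical, and transaction visibility is ignored entirely. In real Neo4j code a second thread may not see a node that was created but not yet committed, which is presumably why Johan suggests holding a kernel lock until the transaction finishes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for an index-backed get-or-create; these are not Neo4j classes. */
public class GetOrCreateSketch {
    // Thread-safe lookups, playing the role of the index.
    private final Map<Long, Integer> index = new ConcurrentHashMap<>();
    // Counts how many "nodes" were actually created.
    private final AtomicInteger created = new AtomicInteger();

    /** Fast path: look up without locking, only lock when the entry is missing. */
    Integer getOrCreate(long id) {
        Integer node = index.get(id);
        if (node == null) {
            node = create(id);
        }
        return node;
    }

    /** Slow path: re-check under the lock so only one thread creates the entry. */
    private synchronized Integer create(long id) {
        Integer node = index.get(id);
        if (node == null) {
            node = created.incrementAndGet();
            index.put(id, node);
        }
        return node;
    }

    int createdCount() {
        return created.get();
    }

    /** Starts several threads asking for the same id and reports how many nodes were created. */
    static int race(int threads, long id) {
        GetOrCreateSketch sketch = new GetOrCreateSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> sketch.getOrCreate(id));
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sketch.createdCount();
    }

    public static void main(String[] args) {
        System.out.println("nodes created: " + race(8, 42L));
    }
}
```

Callers that find the entry on the fast path never take the lock, so the synchronized cost is paid only while an id is being created for the first time.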
Re: [Neo4j] graph-matching from web application
Hi Jonathan, On Sat, Jul 17, 2010 at 2:56 AM, Jonathan Marten gurkensa...@gmx.de wrote: Dear all, I want to use the graph-matching component in the following way: a) user creates a subgraph via html form b) the ids of all matching subgraphs are retrieved via the graph-matching component in neo4j c) the ids are used to do some stuff in the web application (a wild mixture of Perl and PHP scripts) a) and c) exist and are currently still connected to PostgreSQL. b) works if I implement an example in Java, but what is the best way to connect these steps? I'm not sure I understand your setup. Could you describe a, b and c in more detail? What do you mean by a subgraph in this case? What makes it a subgraph, i.e. what is the greater graph? Does the REST API offer graph-matching capabilities? Currently, it does not. How can I make sure that several users can use the web application at the same time? (The database server and the web server are separate machines. The database does not change.) Well, this depends on how you roll it. If you have a separate database, then you will have to access it via e.g. REST or using the remote graph db API. But you can also have it embedded in your application, running in the webapp. But you might not be using a Java webapp? Best regards, Jonathan P.S.: I'm still waiting for answers on two questions I asked earlier, if anybody knows: 1) Is it possible to have the property id attached to a graph without connecting every node to an id-node? I don't understand what you mean. Please clarify. For example, you can't attach properties to the graph. Only to nodes and relationships. 2) Graph-matching: I still need clarification on matching subgraphs without knowing a node. Can this be done? Is it possible to match a subgraph when I know properties of nodes? Is it possible to match a subgraph when I only know relationships but nothing at all about the nodes? If not: How do I efficiently traverse the graph to find such subgraphs? 
What is your use case? What is it that you actually want to do? But to answer your questions: I think you always need to do matching starting from a node. You can match subgraphs with properties using the addPropertyConstraint method on PatternNode and PatternRelationship. You can match relationships too using the PatternRelationship class. David ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Very slow read performance-1sec to get a node's relationships
Hi Amir, In your previous e-mail you listed some sizes of store files: neostore.propertystore.db : 2GB neostore.propertystore.db.strings : 4GB You are however only memory mapping a fraction of the size of those files: neostore.propertystore.db.mapped_memory=100M neostore.propertystore.db.strings.mapped_memory=200M Try increasing these numbers to cover the whole files. Alternatively, you could also try using the auto config feature scheduled for 1.1. To use it, just make sure to not provide the memory mapping configurations you want auto configured, and they will automatically be set to sensible values. You need to be using 1.1-SNAPSHOT to get access to that feature though. Also, it shouldn't affect performance negatively in your case, but you have overdimensioned the mapped memory for the relationship store file. It is 1.5GB and you have mapped 5GB. In general it's good to map as much as you need. If you'd like, you can also run your test with dump_configuration=true in your config and then send us the output printed at startup. We're interested to hear how this works out for you. David On Sun, Jul 18, 2010 at 10:39 PM, Amir Hossein Jadidinejad amir.jad...@yahoo.com wrote: I have program that runs different threads in parallel. Each thread do the following job: for (IteratorRelationship itr = current_node.getRelationships(NodeRelationshipTypes.related_to) .iterator(); itr.hasNext();) { Relationship rel = itr.next(); Node out_node = rel.getOtherNode(current_node); String node_type = out_node.getProperty(type).toString(); Q.add(out_cui); } -just getting a set of output links. 
By the following command:

    java -d64 -server -XX:+UseNUMA -Xmx4096m -classpath $CLASSPATH:../lib/geronimo-jta_1_spec-1.1.1.jar:../lib/jline-0.9.94.jar:../lib/lucene-core-2.9.2.jar:../lib/mysql-connector-java-5.1.7-bin.jar:../lib/neo4j-index-1.1-20100714.135430-157.jar:../lib/neo4j-kernel-1.1-20100714.134745-137.jar:../lib/neo4j-remote-graphdb-0.7-20100714.140411-116.jar:../lib/neo4j-shell-1.1-20100714.140808-144.jar:../lib/servlet-api.jar:../lib/trove.jar:../lib/weka.jar:. myclass

On a machine with 12GB memory. After running, disk is overloaded while heap memory is free! It takes roughly 1sec for each thread in order to get a list of outgoing links! The following is my configuration file:

    neostore.nodestore.db.mapped_memory=120M
    neostore.relationshipstore.db.mapped_memory=5G
    neostore.propertystore.db.mapped_memory=100M
    neostore.propertystore.db.strings.mapped_memory=200M
    neostore.propertystore.db.arrays.mapped_memory=0M

After inspecting the application using jconsole, I found that the configuration settings are Unavailable (the snapshot is attached). Why is my application so slow? Would you please provide a list of TODO checks?
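The sizing advice in this thread (map enough memory to cover each store file) can be sketched as a small helper. This is only an illustration, not part of Neo4j: the class name MappedMemoryAdvisor and the rounding up to whole megabytes are assumptions, and only the store file names and the mapped_memory keys come from the messages above. Whether the resulting total still fits beside the JVM heap on a given machine has to be checked separately.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: derives mapped_memory settings from store file sizes.
final class MappedMemoryAdvisor {

    /** Rounds a size in bytes up to whole megabytes, in the "123M" form used in the config file. */
    static String toMegabytes(long bytes) {
        long oneMb = 1024L * 1024L;
        long mb = (bytes + oneMb - 1) / oneMb;
        return mb + "M";
    }

    /** Maps each store file name to a setting large enough to cover the whole file. */
    static Map<String, String> suggest(Map<String, Long> storeSizesInBytes) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : storeSizesInBytes.entrySet()) {
            settings.put(e.getKey() + ".mapped_memory", toMegabytes(e.getValue()));
        }
        return settings;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L * 1024L;
        Map<String, Long> sizes = new LinkedHashMap<>();
        // Sizes quoted in the thread: 2GB properties, 4GB strings, 1.5GB relationships.
        sizes.put("neostore.propertystore.db", 2 * gb);
        sizes.put("neostore.propertystore.db.strings", 4 * gb);
        sizes.put("neostore.relationshipstore.db", 3 * gb / 2);
        for (Map.Entry<String, String> e : suggest(sizes).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

In a real setup the sizes would come from the store files on disk (for example via java.io.File.length()) rather than being hard-coded.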
Re: [Neo4j] [RE] Feedback on the mailing list
Hi Lilia, The language of this mailing list is English, so please write all e-mails to it in English. Google Translate does however translate your e-mail to: First, hello! Newsletter works, why not? Hello to you too! What newsletter are you referring to? David

2010/7/18 Лилия Бакова liliabak...@mail.ru

Ekaterina, hello! The mailing list is working, why wouldn't it be? ___ My projects: http://eligans.ru http://lan-electric.ru http://adivas.ru
Re: [Neo4j] Searching within relationships
Hi Peter,

Those use cases do sound feasible. The first one, finding all of 123's friends named Bob, you can do in different ways. You could iterate the relationships on 123 finding the friends and do the filtering/matching manually, or you could do a depth 1 traversal with some constraints, or, depending on the nature of the indexed property, find all nodes with name Bob and see if they are connected to 123.

For the other use case, list 123's friends sorted by friend count, you can do a depth 1 traversal or manually iterate over the relationships of 123, and then use an intermediate data structure to do the sorting of the data you gathered in it. Indexing won't help you here, unless you store the number of friends as a precomputed or eventually computed property on the users.

Marko's suggestions using Gremlin make it a lot less verbose, but it's also good to know how to do it using the APIs.

David

Sent from cell, excuse typos etc etc

On Saturday, July 17, 2010, Peter Soung pe...@sproutsocial.com wrote:

Here's some additional context. I'm using Neo4j to store connections between different contacts in different network groups. As a simple example, User A may be connected to User B in Network A and Network B, but not Network C. This would be represented by 2 nodes (one for each user) and 2 relationships (one for Network A and one for Network B). Each user has several internal properties, such as name, network IDs, etc.

With that said, I would like to be able to search within all of a user's contacts (i.e. anyone with a specified relationship). For example, I want to find anyone with the name = Bob that is connected to the user with ID = 123. You can assume that all of the aforementioned internal properties for a user are indexed. Another possible use case is this: show me anyone connected to the user with ID = 123 and sort those contacts by users with the most outgoing relationships.
Hopefully that provides enough context to our use of Neo4j and the use cases we're looking to support. Do either/both of those scenarios sound feasible? Thanks, Peter

Hi Peter, Just to understand the issue at hand, what does your graph look like? What problem do you want to solve? Do you have User nodes connected with some type of relationship, and want to find all users connected to a given user, who e.g. have an age property with a value greater than 30? Also, what are you indexing? David

On Fri, Jul 16, 2010 at 2:44 PM, Peter Soung pe...@sproutsocial.com wrote:

Hello, Is there a high-performance way to search/lookup users within the relationships of a given user? This is assuming that the relevant properties have been indexed. Thanks, Peter
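The two approaches David describes in this thread (filter the direct friends manually, and gather the friends into an intermediate structure before sorting by friend count) might look roughly like the sketch below. It deliberately uses plain Java collections as a stand-in for the graph, so the class FriendSearch and all of its methods are hypothetical and are not Neo4j API. In a real application the inner map lookups would be replaced by iterating the relationships of a node.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a user graph; not Neo4j API.
final class FriendSearch {
    // user id -> ids of directly connected users (the depth 1 neighbourhood)
    private final Map<Integer, Set<Integer>> friends = new HashMap<>();
    private final Map<Integer, String> names = new HashMap<>();

    void addUser(int id, String name) {
        names.put(id, name);
        friends.computeIfAbsent(id, k -> new HashSet<>());
    }

    void connect(int a, int b) {
        friends.get(a).add(b);
        friends.get(b).add(a);
    }

    /** Use case 1: iterate the direct friends and filter by name manually. */
    List<Integer> friendsNamed(int userId, String name) {
        List<Integer> result = new ArrayList<>();
        for (int friend : friends.get(userId)) {
            if (name.equals(names.get(friend))) {
                result.add(friend);
            }
        }
        Collections.sort(result); // stable output order for the example
        return result;
    }

    /** Use case 2: collect the direct friends, then sort them by their own friend count. */
    List<Integer> friendsByFriendCount(int userId) {
        List<Integer> result = new ArrayList<>(friends.get(userId));
        result.sort((a, b) -> {
            int byCount = Integer.compare(friends.get(b).size(), friends.get(a).size());
            return byCount != 0 ? byCount : Integer.compare(a, b); // ties: lower id first
        });
        return result;
    }
}
```

Storing the friend count as a property on each user, as suggested above, would remove the need to count relationships at query time, at the cost of keeping that property up to date.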
Re: [Neo4j] Transactions and creating new nodes
Hi Dmitrii, Separate threads can create nodes in different transactions, yes. You could lock around the user creation code you included in order to keep user creation atomic. David

On Fri, Jul 16, 2010 at 10:59 AM, Dmitrii Dimandt dmitr...@gmail.com wrote:

It's possible that I haven't looked too hard, but my question is this: A typical scenario for a site is to create a user. Before you create a user, you do this in a transaction:

- begin transaction
- check if such a user (node in neo4j) exists
- if it exists, end transaction
- return
- if such a user doesn't exist, create a new user(node)
- end transaction
- return

Can a separate thread create a new node while I'm in a transaction? If it can, how can I prevent that thread from creating a user with the user name I'm about to create?
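A minimal sketch of the "lock around the user creation code" suggestion follows. The class UserRegistry is hypothetical, and the HashMap stands in for the index lookup and node creation that would happen inside the transaction. It also assumes all writers run in the same JVM (as with an embedded database), since a Java monitor does not protect against other processes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: serializes check-then-create so two threads cannot both create the same user.
final class UserRegistry {
    private final Object creationLock = new Object();
    // Stand-in for "look the user up in the index" and "create the node".
    private final Map<String, Long> userIdByName = new HashMap<>();
    private long nextId = 0;

    /** Returns true if the user was created, false if the name was already taken. */
    boolean createIfAbsent(String userName) {
        synchronized (creationLock) {
            // A real implementation would begin the transaction here.
            if (userIdByName.containsKey(userName)) {
                return false; // user exists: end the transaction and return
            }
            userIdByName.put(userName, nextId++);
            // A real implementation would mark success and finish the transaction
            // before the lock is released, so the next thread sees the new user.
            return true;
        }
    }

    int size() {
        synchronized (creationLock) {
            return userIdByName.size();
        }
    }
}
```

The important detail is that the existence check and the creation happen under the same lock, and that the transaction is committed before the lock is released.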
Re: [Neo4j] Parallel Writing on a Node.
Hi Stefan,

Just off the top of my head, I don't think there is a solution for this. The question is, is it a problem? You have a first transaction that changes a property, and then a second transaction comes along and changes the same property. What behavior would you expect here? Couldn't your timing just be *slightly* off so that the first transaction *just* finishes before the second one overwrites the property, and then you'll be in the same situation without any write lock contention at all?

As stated above, it's just off the top of my head. Please let me know if I misunderstood the problem.

David

On Fri, Jul 16, 2010 at 1:18 PM, Stefan Berndt kont...@stberndt.de wrote:

Hi, I've been trying out neo4j for a while and want to see what happens if I write concurrently to a node. This is my test case:

    public void foo() throws InterruptedException {
        final GraphDatabaseService db = ds.getGdb();
        final CountDownLatch available = new CountDownLatch(2);
        final Runnable r1 = new Runnable() {
            @Override
            public void run() {
                Transaction t1 = db.beginTx();
                try {
                    Node n1 = db.getNodeById(1008001);
                    System.err.println("TX1: " + System.identityHashCode(n1));
                    n1.setProperty("name", "NewName1");
                    System.err.println();
                    sleep(200);
                    System.err.println("Thread1name: " + n1.getProperty("name"));
                    System.err.println(n1.getProperty("name"));
                    Node n2 = db.getNodeById(1008001);
                    System.err.println(n2.getProperty("name"));
                    t1.success();
                } finally {
                    t1.finish();
                }
                sleep(100);
                Transaction t2 = db.beginTx();
                try {
                    Node n2 = db.getNodeById(1008001);
                    System.err.println(n2.getProperty("name"));
                } finally {
                    t2.finish();
                }
                available.countDown();
            }
        };
        final Runnable r2 = new Runnable() {
            @Override
            public void run() {
                Transaction t2 = db.beginTx();
                Node n1 = db.getNodeById(1008001);
                System.err.println("TX2: " + System.identityHashCode(n1));
                try {
                    sleep(50);
                    n1.setProperty("name", "NewName2");
                    t2.success();
                } finally {
                    t2.finish();
                }
                available.countDown();
            }
        };
        ExecutorService pool = Executors.newFixedThreadPool(10);
        pool.execute(r1);
        pool.execute(r2);
        available.await();
        Node node = db.getNodeById(1008001);
        System.err.println("Neuer Name= " + node.getProperty("name"));
        Assert.assertEquals("NewName1", node.getProperty("name"));
    }

    private void sleep(int millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            e.printStackTrace(); //To change body of catch statement use File | Settings | File Templates.
        }
    }

The problem is that the information of Runnable r1 is lost. R1 writes to the node, but after that the write lock on the node is gone and r2 writes to the same node. Is there a way to implement an exception that is thrown if something like this happens? A notification if something like this happens would also be OK. Thank you for your help. Best Regards, Stefan
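The thread does not offer a built-in way to get an exception when a later transaction overwrites an earlier write. One application-level option, sketched below, is an optimistic version check: every write states which version it read, and a mismatch raises an exception. This is a general technique and not a Neo4j feature. The class VersionedProperty is hypothetical, and in a graph the version counter would have to be stored somewhere such as an extra property on the node.

```java
import java.util.ConcurrentModificationException;

// Hypothetical sketch of an optimistic version check; not Neo4j API.
final class VersionedProperty {
    private String value;
    private long version;

    VersionedProperty(String initialValue) {
        this.value = initialValue;
    }

    synchronized long version() {
        return version;
    }

    synchronized String value() {
        return value;
    }

    /** Writes only if nobody else has written since expectedVersion was read. */
    synchronized void write(String newValue, long expectedVersion) {
        if (version != expectedVersion) {
            throw new ConcurrentModificationException(
                    "expected version " + expectedVersion + " but found " + version);
        }
        value = newValue;
        version++;
    }
}
```

With this in place, the second writer in the test case above would fail instead of silently replacing NewName1, provided both writers read the version before the first write happened.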
Re: [Neo4j] [Neo] Sharding
Hi Peter, There is no out-of-the-box sharding support right now. However, please see http://wiki.neo4j.org/content/External_Articles for a couple of links related to sharding. David On Wed, Jul 7, 2010 at 11:20 PM, Peter Soung pe...@sproutsocial.com wrote: Are there any updates about sharding a neo4j implementation? Thanks, Peter
Re: [Neo4j] Searching within relationships
Hi Peter, Just to understand the issue at hand, what does your graph look like? What problem do you want to solve? Do you have User nodes connected with some type of relationship, and want to find all users connected to a given user, who e.g. have an age property with a value greater than 30? Also, what are you indexing? David On Fri, Jul 16, 2010 at 2:44 PM, Peter Soung pe...@sproutsocial.com wrote: Hello, Is there a high-performance way to search/lookup users within the relationships of a given user? This is assuming that the relevant properties have been indexed. Thanks, Peter
Re: [Neo4j] Optimizing neo4j for traversal in Spring
Hi Paddy,

You could let Spring coerce a Properties object into a Map<String, String> for the config. Basically just wire your graphDbService with an additional constructor arg that is the Properties object with your properties. You can construct this object using spring-util (PropertiesFactoryBean) or something else.

An alternative would be to use a PropertyPlaceholderConfigurer or similar and actually wire up the config constructor arg as a map with entries with keys matching the config properties and values that are resolved by the PropertyPlaceholderConfigurer (or similar). Something like this:

    <map>
        <entry key="neostore.nodestore.db.mapped_memory" value="${neostore.nodestore.db.mapped_memory}" />
    </map>

This is just off the top of my head - there might be better ways to do it.

David

On Thu, Jul 15, 2010 at 1:28 PM, Paddy paddyf...@gmail.com wrote:

hi all, I would like to integrate the optimizing neo4j for traversal settings in Spring using the example from the imdb spring app. The graphDbService is configured in app-config.xml:

    <bean id="graphDbService" class="org.neo4j.kernel.EmbeddedGraphDatabase" init-method="enableRemoteShell" destroy-method="shutdown">
        <constructor-arg index="0" value="/home/neo/var/neo4j-db/" />
    </bean>

I want to use the optimizing for traversals example settings. How can the configurations for the neo4j_config.props file be set if the GraphDatabaseService is injected by Spring? http://wiki.neo4j.org/content/Configuration_Settings#Optimizing_for_traversals_example

The Configuration Settings wiki outlines the setup as:

    Map<String,String> configuration = EmbeddedGraphDatabase.loadConfigurations( "neo4j_config.props" );
    GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "my-neo4j-db/", configuration );

but with the spring imdb example it is auto-wired using the @Autowired annotation. Can these settings be configured in spring?

    @Autowired
    private GraphDatabaseService graphDbService;

Thanks a lot Paddy
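If relying on Spring's type coercion feels too implicit, the Properties-to-Map step discussed in this thread can also be done by hand in a small factory class. The sketch below shows only that conversion. The class name ConfigMaps is hypothetical, and passing the resulting map as the second constructor argument of EmbeddedGraphDatabase is taken from the wiki snippet quoted above rather than verified here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: copies string-valued properties into a Map<String, String>.
final class ConfigMaps {

    static Map<String, String> toMap(Properties props) {
        Map<String, String> map = new HashMap<>();
        // stringPropertyNames() only returns keys whose key and value are both strings.
        for (String key : props.stringPropertyNames()) {
            map.put(key, props.getProperty(key));
        }
        return map;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("neostore.nodestore.db.mapped_memory", "120M");
        System.out.println(toMap(props));
    }
}
```

A static method like this can be referenced from a Spring bean definition through the factory-method attribute, which keeps the wiring in the XML config.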
Re: [Neo4j] Cleaning the DB programmatically
Hi Georg, If you could provide your code and Spring config, I could have a look at it and see if I can help you out. A git/hg/svn repo would be great. David On Mon, Jul 12, 2010 at 6:45 PM, Georg M. Sorst georgso...@gmx.de wrote: Hi Rick, admittedly I'm relatively new to Spring (and the whole Java thing) so there may well be better ways to do this. Anyway, the main reason for me to use Spring is the Dependency Injection / Inversion of Control. Unfortunately, this is keeping me from stopping, clearing and restarting Neo4j for each test as Spring is starting it up. Now of course I could set it all up manually for the tests but I like the convenience that DI offers. Also, I can use the same configuration I use in my actual application while only having to overwrite a few beans with mocks and stubs. Worth it? Best regards, Georg On 12.07.2010 15:41, Rick Bullotta wrote: Sounds like another good reason not to use Spring. ;-) In your case, Georg, it certainly sounds like it would be best to delete and recreate. In essence, that's what you're doing anyway. Can you help us understand what value you get from Spring in this scenario versus your own Neo lifecycle management code? Rick -Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Georg M. Sorst Sent: Monday, July 12, 2010 7:43 AM To: Neo4j user discussions Subject: Re: [Neo4j] Cleaning the DB programmatically Hi Peter, thanks for the reply. I can't say that it's not ok for me, can I? :) In my previous mail I mentioned errors such as Node[xyz] has been deleted in this TX. Turns out I had forgotten to clean all indices (because there does not seem to be a way to find out which indices exist currently). Anyway, the errors are gone so I'm fine for now. 
Just for reference, what I'm doing is:

- graph.getAllNodes()
- iterate over all the nodes
- for each node delete all its relationships
- delete the node
- after deleting all nodes delete the indices

Guess I didn't have to tell you that :) Still, thanks for picking up that item for your to-do. Thanks and best regards, Georg

On 12.07.2010 09:30, Peter Neubauer wrote:

Georg, normally people shut down Neo4j and delete the db directory in order to get rid of the entire db. However, I see your point in having a fast way to bulk delete the _content_ of the db. It's not totally simple to do that nicely, so Johan just wrote a note to look at this after Neo4j 1.1. It will probably be a helper doing this, but it needs closer examination. Would that be ok for you? Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Fri, Jul 9, 2010 at 12:33 AM, Georg M. Sorst georgso...@gmx.de wrote:

Hi list, is there a way to clean the DB and the indices programmatically? The reason I'm looking for this is that I'm using (and instantiating) Neo4j from Spring. This works just great, however in order to run my tests from a clean state I need a way to clean the DB and indices. Stopping Neo4j and deleting the store dir is not really a viable option as I would have to recreate the start-up sequence that I have already defined in my Spring context config. What I'm currently doing is iterate over the entire graph and delete all nodes and relationships. This works fine most of the time but sometimes gives me unexpected errors like Node[xyz] has been deleted in this TX. This seems to be especially true when I run all tests in succession.
Another problem with that approach is that there doesn't seem to be a way to find out which indices currently exist making it hard to safely delete them all. So, any pointers would be greatly appreciated. Thanks and best regards, Georg -- Georg M. Sorst Dipl. Inf. (FH) http://vergiss-blackjack.de Ignaz-Harrer-Str. 13 / Top 19 5020 Salzburg Österreich Tel: +43 (0)650 / 53 47 200
[Neo] Import/export
Hi,

This weekend I was toying around with Neo4j. I wanted to do some indexing experiments. Unfortunately I found myself without a graph to work with. Sure, I could write some code to generate a graph for me, but it'd be a one-time-thing. I wanted to get going *now*. That got me thinking about import/export functionality.

I think a command-line import tool would be useful, accompanied by (and built on) a Java API. Both of them would be tied to a certain representation format. The export can be represented in different ways, where two possible ways are:

- State transfer: (node{id:1, name:foo}, node{id:2}, rel{start:1, end:2, type:bar}, ...)
- Operation transfer: (id1 = create node, id2 = create node, create rel id1-id2 type bar, ...)

I guess the state transfer feels like the more straightforward one. The diff-style nature of the operation transfer might be useful in other cases.

When I first thought of this, the target user was somebody who wanted to get started with a graph, and didn't want to write code to do an import manually. Maybe the import/export can extend to other use cases, but this was the primary one. A possible workflow could be db exported to file, file published, file downloaded, file imported into db. In the end, it would be great if new users could download sample data sets and import them into a Neo4j instance without writing a single line of code. Which also gets me thinking about a command-line tool to create an empty Neo4j instance to import into.

The actual implementations of the tools are trivial. It's the discussion that leads to the implementation that's important. Does this sound like anything that would interest people? If so, (digging into details) what kind of representation do you guys think would be best? I was thinking XML, but a binary format might be better for performance (size/primitives ratio). Maybe both? Because I do like the idea of a human-readable (and editable) format.
If you don't think it would be useful I would love to hear why. This is just a brain dump of my thoughts. Surely others have thought of this as well. I'm just getting the discussion started. WDYT? -David ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
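To make the difference between the two export representations in this proposal concrete, here is a small sketch that renders the same tiny graph both ways. Everything in it is illustrative: the class ExportSketch, the Rel type and the exact text format are assumptions modelled on the examples in the message, not an existing Neo4j export format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two export styles; not an existing Neo4j format.
final class ExportSketch {

    /** A directed relationship between two node ids. */
    static final class Rel {
        final int start;
        final int end;
        final String type;

        Rel(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** State transfer: describes what exists (nodes first, then relationships). */
    static List<String> stateTransfer(List<Integer> nodeIds, List<Rel> rels) {
        List<String> out = new ArrayList<>();
        for (int id : nodeIds) {
            out.add("node{id:" + id + "}");
        }
        for (Rel r : rels) {
            out.add("rel{start:" + r.start + ",end:" + r.end + ",type:" + r.type + "}");
        }
        return out;
    }

    /** Operation transfer: describes the steps that would recreate the graph. */
    static List<String> operationTransfer(List<Integer> nodeIds, List<Rel> rels) {
        List<String> out = new ArrayList<>();
        for (int id : nodeIds) {
            out.add("id" + id + " = create node");
        }
        for (Rel r : rels) {
            out.add("create rel id" + r.start + "-id" + r.end + " type " + r.type);
        }
        return out;
    }
}
```

The state form is easier to read and edit by hand, while the operation form can be appended to incrementally, which matches the diff-style use mentioned in the message.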
Re: [Neo] XML data import strategy?
Hi Peter, I may be jumping in on this prematurely, but as I've understood it, the graph in Neo4j should be centered around your domain model. And generic XML nodes don't really map to any such model, unless you're specifically writing an XML database. So my gut feeling tells me that some utils making #2 easier would be more useful than #1. But ymmv, as always. -David

On Tue, Dec 1, 2009 at 6:13 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote:

Hi folks, I have come across the problem of importing data from XML files a lot lately. There are 2 approaches that emerged after discussions with Craig Taverner:

1. Write a generic utility: take the XML DOM tree and directly put it as-is into Neo4j, then later write code to connect the interesting nodes to your domain, maybe discard the DOM tree after that.
2. Write a specific domain parser: Filter out interesting info with e.g. XPath, then create only the relevant information as a graph in Neo4j.

Now, 1) sounds like a generic import utility that would be very handy. OTOH, I wonder how common the problem is and what you guys would prefer, 1) or 2)? -- Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Relationships count. http://www.linkedprocess.org - Distributed computing on LinkedData scale
[Neo] In-memory Neo
Hi, I'm developing a small application that is launched through Java Web Start. I want to use Neo for the node space features, but I don't need any persistence, as the application won't need to save any state between runs. What I'm wondering is, is there an in-memory NeoService implementation that doesn't access the file system at all? /David
[Neo] Testing the list after an upgrade
Disregard this message. ___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo] Storage overhead
Hi everyone, I am thinking about storing binary data in Neo. When I ran some simple tests, there seemed to be considerable amounts of overhead. For example, when I stored 10MB of data (byte[10*1024*1024]) in a node property, var/nio increased in size from 3MB to 18MB. Does anybody have an explanation for this increase in disk usage? Does anyone have any other suggestions on how to store binary data in Neo? /David
Re: [Neo] Mapping UUID to Neo node
Hey Philip,

I'm no whiz at this, I just want to give you my two cents. I'm looking forward to hearing what the Neo developer team and the other community members have to say here too. As I understand it, Neo's primary usage is not for mapping one number to another (node ID), or numbers to objects, if you will. It is not where Neo shines. I would probably go with some kind of Map implementation. You could probably arrange the UUIDs in some kind of tree though. Not sure if this would be more effective than a java.util Map implementation. I guess the issue would be where to persist this data structure. That I'd like to know.

/David

PS. Is it possible to create a node with a specific node ID? DS.

On Sun, Apr 27, 2008 at 9:59 AM, Philip Jägenstedt [EMAIL PROTECTED] wrote:

Hi again! I've hit my head against the Java wall for a while, and then some again with Python. What I'm doing is basically map to put MusicBrainz data (artist MEMBER_OF band, album MASTERED_BY artist, etc) into Neo. I don't need an index for searching, as MusicBrainz already has an XML webservice which I can use. But I do need to find a node from its unique UUID, which is a 128-bit number. What are my options?

http://wiki.neo4j.org/content/Design_Guide#Search assumes that I know what Maven is and want to use it. I've found http://components.neo4j.org/index-util/ but can't find any API documentation. Can I use it without jumping through Maven hoops? Equally importantly, what indexing options are there with the Python wrapper? This must be a very common problem, so I hope there is a better solution than using a single table in a relational database as an index...

Philip
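The "some kind of Map implementation" idea from David's reply could be as simple as the sketch below, which keeps a java.util map from UUID to node ID in memory. The class UuidIndex is hypothetical, and the sketch leaves open the question raised in the reply of where to persist the map between runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory lookup from a MusicBrainz-style UUID to a node ID.
final class UuidIndex {
    private final Map<UUID, Long> nodeIdByUuid = new HashMap<>();

    /** Remembers which node ID belongs to the given UUID. */
    void put(UUID uuid, long nodeId) {
        nodeIdByUuid.put(uuid, nodeId);
    }

    /** Returns the node ID for the UUID, or null if the UUID is unknown. */
    Long lookup(UUID uuid) {
        return nodeIdByUuid.get(uuid);
    }

    public static void main(String[] args) {
        UuidIndex index = new UuidIndex();
        UUID artist = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        index.put(artist, 42L);
        System.out.println(index.lookup(artist));
    }
}
```

java.util.UUID already implements equals and hashCode over the full 128 bits, so it can be used as a map key directly.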