Re: [Neo4j] TransportDublin Route Planner Github Project
Paddy,

I took the liberty of formatting the wiki a bit and put a Maven profile into the pom.xml that downloads and expands the data automatically if you run

    mvn -P import install jetty:run

http://github.com/peterneubauer/TransportDublin

Is that OK? Would it be possible to do the import automatically too, so we could extract the timetable data live somewhere?

Cheers,
/peter neubauer

COO and Sales, Neo Technology
GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Sun, Jul 25, 2010 at 11:02 AM, Paddy paddyf...@gmail.com wrote:

I have added a Quick Install Guide on the wiki if anyone would like to try:

Download source:
1. Download the latest zip source file from: http://github.com/paddydub/TransportDublin/archives/master
2. Extract the zip contents into C:\dev\transportdublin\

Download database:
3. Download a prepopulated graph database from neo4j-db.zip: http://www.transportdublin.ie/neo4j/neo4j-db.zip
4. Extract the neo4j-db.zip file to folder: C:\dev\transportdublin\data\
5. cd to C:\dev\transportdublin\

Launch Jetty server:
6. From the command line type: mvn jetty:run
7. Point your browser to http://localhost:8080/transportdublin/routeplanner
8. Click on two locations on the map to generate a route

Thanks
Paddy

On Fri, Jul 23, 2010 at 4:53 PM, Paddy paddyf...@gmail.com wrote:

Hi,

I have updated the wiki with screenshots and information: http://wiki.github.com/paddydub/TransportDublin/ and I have uploaded my code and bus stop data SQL script. Any suggestions or recommendations would be appreciated.

I'm currently populating my graph from a MySQL database. I'm working next on implementing the BatchInserter to speed up the graph setup process.

Thanks
Paddy

On Wed, Jul 21, 2010 at 3:26 PM, Anders Nawroth and...@neotechnology.com wrote:

Hi Paddy!

Some interesting stuff you're working on there!

I'd like to write a bit about the differences between neo4j and sql and why neo4j is a perfect solution for route planning systems, do you think a wiki would be the best option to display the pics?

I think the Github wiki of the project could be a good place to put the article. Images can be added to the source repo (just remember to use the raw version of the images as img src) or can be uploaded as downloads of the project. When your writings are in place, they should of course be linked from the Neo4j wiki. WDYT?

/anders

I will be uploading the code today and tomorrow, just making some last minute changes and writing some documentation.

Cheers
Paddy

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
[Neo4j] TraversalDescription building hickup
Hi all,

I just stumbled over the immutable TraversalDescription API (http://components.neo4j.org/neo4j-kernel/apidocs/index.html), which will not modify the object if you do

    TraversalDescription td = new TraversalDescriptionImpl();
    td.depthFirst();

Instead, one needs to reassign td, like

    TraversalDescription td = new TraversalDescriptionImpl();
    td = td.depthFirst();

However,

    TraversalDescription td = new TraversalDescriptionImpl().depthFirst();

will give you the expected td.

IMHO this is unexpected behaviour and hard to grasp if you just follow the common fluent API style and presume a Builder pattern, especially since no errors are thrown and you just end up with strange results and unreachable code in e.g. a custom PruneEvaluator. True, the API says it is immutable, but I still think this is hard. WDYT? Should we think of changing this to a proper builder.modify().modify() etc. and finally builder.build(), which gives you the final, immutable instance of TraversalDescription and is clearly understandable by clients?

Cheers,
/peter neubauer
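To illustrate the pitfall in isolation, here is a minimal Python sketch of an immutable fluent object. The Description class and its depth_first() and depth() methods are hypothetical stand-ins, not the Neo4j API; they only show why a discarded return value leaves the original untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Description:
    """Hypothetical stand-in for an immutable traversal description."""
    order: str = "breadth_first"
    max_depth: int = 1

    def depth_first(self):
        # Returns a modified copy; self is left untouched.
        return replace(self, order="depth_first")

    def depth(self, n):
        # Same idea: a new instance with a different depth limit.
        return replace(self, max_depth=n)


td = Description()
td.depth_first()              # result discarded, so td is unchanged
order_after_discard = td.order

td = td.depth_first()         # reassigning keeps the new instance
chained = Description().depth_first().depth(3)
```

Because no call ever fails, the first form silently keeps the default ordering, which matches the "no errors, strange results" symptom described above.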
[Neo4j] Python Newbie Struggling Slightly
Hello,

I really like the look of neo4j and would like to use it in a project I'm working on: http://pppeoplepppowered.blogspot.com/ . I'm new to graphs and how to work with them, it's all new, but I'm really drawn to them, having banged my head on SQL schemas for years (and years).

My problem with working with python neo4j is that there aren't enough simple (enough) examples. This sort of thing is great ...

http://blog.neo4j.org/2010/03/modeling-categories-in-graph-database.html

... but it doesn't explain a few things and goes wrong in places. For example, from the example, if I try to execute...

    computers_node = graphdb.node(Name="Computers")

... I get...

    jpype._jexception.RuntimeExceptionPyRaisable: org.neo4j.graphdb.NotInTransactionException: No transaction found for current thread

So I need to add...

    with graphdb.transaction:
        computers_node = graphdb.node(Name="Computers")

... which raises the issue of transactions and database connections. I'm unsure when to use transactions and how to use connections. Either way, I regularly seem to bump into a...

    jpype._jexception.RuntimeExceptionPyRaisable: org.neo4j.kernel.impl.transaction.TransactionFailureException: Could not create data source [nioneodb], see nested exception for cause of error

... because I like to use the interpreter and learn what objects can do, and later create .py files and import them. It's not clear if I can have more than one db connection open at once, or whether I should open and shut down the database every time... or is it better to have one database connection hanging around in a file somewhere?

At the moment I'm trying to create a script that takes a csv file of people, adds them, then tries to get the data out somehow, like this...

    import stuff.neo_utils as neo   # see below
    neo.import_people()             # import a csv
    7051                            # the id of the root_person
    neo.people(7051)                # get the people out via the root_person
    # nothing!
    neo.people(8224)                # the id of the last_person
    <Node id=7051>

My question is this... am I doing it all wrong? Could someone create a very simple example that, say, populates a graph, gets data out, manipulates that data and then searches that data (say for an attribute, or to see if it exists etc.) in a single python file? So that I can begin to build up my understanding.

Thanks for listening,
tom

    #!/usr/bin/env python
    import sys, random, csv
    from time import sleep
    from random import randint
    import neo4j

    class Person(neo4j.Traversal):
        types = [ neo4j.Outgoing.is_a ]
        order = neo4j.BREADTH_FIRST
        stop = neo4j.STOP_AT_END_OF_GRAPH
        returnable = neo4j.RETURN_ALL_BUT_START_NODE

    def people(person_root_id):
        try: graphdb.shutdown()
        except: pass
        graphdb = neo4j.GraphDatabase( "neo_db" )
        with graphdb.transaction:
            person_root = graphdb.node[person_root_id]
            for person_node in Person(person_root):
                try:
                    print "%s %s (@depth=%s)" % ( person_node['family_name'], person_node['email'], person_node.depth)
                except:
                    print person_node
        graphdb.shutdown( )

    # The data is like this...
    #tas...@york.ac.uk Staff Mr T Smith |Computing Service: | |Vanbrugh College V/C/011| |+44 1904 433847|
    # https://www.york.ac.uk/directory/user.yrk/searchdetail.cfm?scope=staffref=M95%27%22%3DYBD8%5B%3ANEJ%27S%27I%2AX%20%20F%2D6Y2%3D%20SR%21A%409%2C%40E2%3D%205%2EFMOM6A%3A%3EWIHV4T%5D%5E%3B%0A%2B4%2D%2A%3EG%2D%2F6EUS%22BI0%20%0Areferrer=searchResults

    def import_people(name="Untitled", file='/Users/tomsmith/pppeoplepppowered/staff/staff.csv', ):
        'load a lot of people into the database, connecting each to a root Person object by a is_a relationship, spurious I know '
        graphdb = neo4j.GraphDatabase( "neo_db" )
        with graphdb.transaction:
            person_root = graphdb.node(name="Person")  # create a root node of sorts
            person_root_id = person_root.id
            csvReader = csv.reader(open(file), delimiter=' ', quotechar='|')
            for row in csvReader:
                email = row[0]
                kind = row[1]
                title = row[2]
                given_name = row[3]
                family_name = row[4]
                department = row[5]
                org_path = row[6]
                full_address = row[7]
                telephone = row[8]
                src = row[9]
                url = row[10]
                external_url = row[11]
                person = graphdb.node(email=email, title=title,
                                      given_name=given_name, family_name=family_name,
                                      telephone=telephone, department=department,
                                      org_path=org_path, src=src,
                                      url=url, external_url=external_url)
                person.is_a( person_root )
                print person, "created and linked"
            print "done importing!"
        graphdb.shutdown( )
        return person_root_id
[Neo4j] Python Newbie put another way...
This doesn't work (from the tutorial page)... any ideas where I'm going wrong? Thanks...

    import neo4j
    from neo4j.util import Subreference

    graphdb = neo4j.GraphDatabase( "test_neo4j_db" )

    class SubCategoryProducts(neo4j.Traversal):
        types = [neo4j.Outgoing.SUBCATEGORY, neo4j.Outgoing.PRODUCT]
        def isReturnable(self, pos):
            if pos.is_start: return False
            return pos.last_relationship.type == 'PRODUCT'

    def attributes(product_node):
        for category in categories(product_node):
            for attr in category.ATTRIBUTE:
                yield attr

    class categories(neo4j.Traversal):
        types = [neo4j.Incoming.PRODUCT, neo4j.Incoming.SUBCATEGORY]
        def isReturnable(self, pos):
            return not pos.is_start

    with graphdb.transaction:
        attribute_subref_node = Subreference.Node.ATTRIBUTE_ROOT(graphdb)
        #attribute_subref_node.ATTRIBUTE_TYPE("Price") #Fails, do I need to pass an object?
        #attribute_subref_node.ATTRIBUTE_TYPE("Length")
        #attribute_subref_node.ATTRIBUTE_TYPE("Name")
        category_subref_node = Subreference.Node.CATEGORY_ROOT(graphdb, Name="Products")
        computers_node = graphdb.node(Name="Computers")

        #create some categories
        electronics_node = graphdb.node(Name="Laptops")
        electronics_node.SUBCATEGORY(computers_node)
        netbooks_node = graphdb.node(Name="Netbooks")
        netbooks_node.SUBCATEGORY(computers_node)
        desktops_node = graphdb.node(Name="Netbooks")
        desktops_node.SUBCATEGORY(computers_node)

        #create some products
        little_dell = graphdb.node(Name="Little Dell", Colour="red", Price=210)
        little_dell.is_a( netbooks_node )
        print little_dell.id
        little_acer = graphdb.node(Name="Little Acer", Colour="grey")
        little_acer.is_a( netbooks_node )
        print little_acer.id
        little_eee = graphdb.node(Name="Little EEE", Colour="white", Price=200 )
        little_eee.is_a( netbooks_node )
        print little_eee.id

        for rel in computers_node.SUBCATEGORY.outgoing:
            print rel.end['Name']

        for prod in SubCategoryProducts(computers_node):
            print prod['Name']
            for attr in attributes(prod):
                print attr['Name'], "of type", attr.end['Name']
            print

    graphdb.shutdown()
[Neo4j] Querying for nodes that have no relationhip to a specfic node
Hi,

I'm considering using neo4j for a project I'm currently working on. I need to do the following periodically (e.g. daily):

* Step 1: for every node, let's call it A, I need to pick n other nodes at random that fulfil certain attribute criteria and have no relationship to A.
* Step 2: for each of those nodes and A, I calculate some value and store it in the relationship.

Regarding step 1, from what I've read, it seems there is no way of querying for nodes that have no relationship to a specific node. Of course I could query all the nodes in the database that fulfil the criteria, store them in a variable, then query all relationships of node A and subtract those nodes from the array. But I think this approach won't work very well as the number and density of relationships grows...

Do you have any recommendations? Can you suggest another strategy? Perhaps there is a way of making that query?

Any help is appreciated.

Alberto.
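As a rough sketch of step 1, the subtraction strategy described above can be written next to a rejection-sampling variant that never builds the filtered list and so stays cheap while A has few relationships. This is plain Python over an in-memory adjacency map, not a Neo4j query; pick_unrelated, pick_unrelated_by_rejection and the neighbours dict are names assumed for illustration only.

```python
import random


def pick_unrelated(node, candidates, neighbours, n, rng=random):
    """Subtract node's neighbours from the candidates, then sample up to n."""
    excluded = neighbours.get(node, set()) | {node}
    pool = [c for c in candidates if c not in excluded]
    return rng.sample(pool, min(n, len(pool)))


def pick_unrelated_by_rejection(node, candidates, neighbours, n,
                                rng=random, max_tries=1000):
    """Draw random candidates and discard those already related to node."""
    excluded = neighbours.get(node, set()) | {node}
    picked = set()
    tries = 0
    while len(picked) < n and tries < max_tries:
        candidate = rng.choice(candidates)
        tries += 1
        if candidate not in excluded:
            picked.add(candidate)
    return picked


# Toy data: A is already related to B and C.
neighbours = {"A": {"B", "C"}}
candidates = ["A", "B", "C", "D", "E", "F"]

by_subtraction = pick_unrelated("A", candidates, neighbours, 2, random.Random(0))
by_rejection = pick_unrelated_by_rejection("A", candidates, neighbours, 2,
                                           random.Random(0))
```

The rejection variant degrades as A becomes connected to most candidates, which is why it carries a max_tries cap; in that dense case the subtraction variant is the safer choice.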
Re: [Neo4j] TraversalDescription building hickup
2010/7/27 Peter Neubauer peter.neuba...@neotechnology.com

Should we think of changing this to a proper builder.modify().modify() etc. and finally builder.build(), which gives you the final, immutable instance of TraversalDescription and is clearly understandable by clients?

I still think the current approach is more useful (although it'd be nice to get more input on this). One reason I think it's better is that you can half-bake descriptions as private static final constants or similar and then complete them in several different places in your code. You can even pass descriptions into methods and whatnot, without any risk of them being modified. I think the javadoc should explain this better, and it should be expected that developers read javadoc, right?

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] TraversalDescription building hickup
I may have missed your point. But, FWIW, this model reflects what I would expect from an immutable object. For example:

    String s = "Test";
    s.replace('T', 't');
    // s still contains "Test"

BigInteger is the same way.

-Paul
[Neo4j] Enabling LRU cache with BatchInserter
Hi, I'm trying to call enableCache(..) for the following example: http://wiki.neo4j.org/content/Batch_Insert#Using_batch_inserter_together_with_indexing How would I go about doing that? ~Mohit ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] TraversalDescription building hickup
I think the key point of Peter's request is to separate the 'builder' from the 'traverser'. Mattias's argument appears to be that if the builder and traverser are the same class (a series of immutable instances of the same class), you have more flexibility in refactoring, because you don't have to remember to convert to the traverser as the last step (Peter called build() as the last step of a builder, creating a 'really immutable' traverser). The String.replace example is similar to Mattias's description: one immutable object creating another instance of the same class.

However, I personally think Peter has a point. Keeping the builder and traverser as two separate objects might force the coder to make a clear decision about when the traverser should no longer be modified. Once the build() method is called, we are done. Before that point the developer still has complete flexibility to modify the traversal rules multiple times, in as many steps and places in the code as they want, but once they call build(), the returned traverser cannot be modified at all.

Despite my argument above favouring Peter's approach, I actually do not have a strong opinion on this, and think both ways are good. I confess that had I coded this myself, there is a good chance I would have done it with one class, like Mattias describes.

(And I just realized another thing: we normally access traversers in for(Iterable) loops that call iterator(), which itself returns a 'really immutable' object, so perhaps adding another in the chain is overkill :-)
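For comparison, a minimal Python sketch of the separate-builder design discussed in this thread. TraversalBuilder and Traversal are hypothetical names, not part of Neo4j; the point is only that the builder is mutable until build() is called, and the object it returns cannot be changed afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traversal:
    """Hypothetical final product: immutable once built."""
    order: str
    max_depth: int


class TraversalBuilder:
    """Hypothetical mutable builder; each method changes it and returns self."""

    def __init__(self):
        self._order = "breadth_first"
        self._max_depth = 1

    def depth_first(self):
        self._order = "depth_first"
        return self

    def depth(self, n):
        self._max_depth = n
        return self

    def build(self):
        # Snapshot the current settings into an immutable object.
        return Traversal(self._order, self._max_depth)


builder = TraversalBuilder()
builder.depth_first()     # mutates the builder, so no reassignment is needed
builder.depth(2)
traversal = builder.build()
```

The cost of this design is the one raised earlier in the thread: a half-configured builder shared between call sites can be changed by any of them, whereas an immutable description cannot.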
[Neo4j] property value encoding
Lately I've played with some OpenStreetMap data... The imported nodes have many properties with a small set of values (road type, point-of-interest type, colour, ...), but I don't know the set of values in advance (sometimes a new value becomes standard, sometimes an invalid value is present). Other node properties are just unique text (address, URL).

To speed up the import process I've tried to apply some kind of compression. I've seen that Neo4j encodes property names using a sequence of integers, so I've tried to do the same for the values of all properties that I know contain only a small set of values. With this encoding the database is obviously much smaller.

After importing sweden.osm the database dir is 552M:

    100M neostore.propertystore.db
    220M neostore.propertystore.db.arrays
    227M neostore.propertystore.db.strings

With 'compression' on it is 344M:

    100M neostore.propertystore.db
    220M neostore.propertystore.db.arrays
    20M neostore.propertystore.db.strings

    property value dictionary entries: 16286
    property value dictionary size: 387378 bytes

I don't know if this is a common use case, but it would be cool to have this kind of compression out of the box! WDYT?

Regards,
--
Davide Savazzi
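The encoding described above is essentially a value dictionary: each distinct string is stored once and nodes keep only a small integer. A minimal sketch in Python follows; the ValueDictionary class is hypothetical and says nothing about how the importer or the Neo4j property store actually implements it.

```python
class ValueDictionary:
    """Hypothetical dictionary encoder: distinct value -> small integer id."""

    def __init__(self):
        self._ids = {}       # value -> id
        self._values = []    # id -> value

    def encode(self, value):
        # Unknown values get the next free id, so the value set
        # does not need to be known in advance.
        if value not in self._ids:
            self._ids[value] = len(self._values)
            self._values.append(value)
        return self._ids[value]

    def decode(self, value_id):
        return self._values[value_id]


values = ValueDictionary()
road_types = ["residential", "primary", "residential", "footway", "primary"]
encoded = [values.encode(v) for v in road_types]
decoded = [values.decode(i) for i in encoded]
```

Since ids are assigned on first sight, new or invalid values simply receive new entries, which fits the "I don't know the set of values in advance" constraint.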
Re: [Neo4j] Batch inserter shutdown taking forever
Since you're doing a depth 1 traversal please use something like this instead: for ( Relationship rel : graphDb.getReferenceNode().getRelationships( Relationships.ROUTE, Direction.OUTGOING ) ) { Node node = rel.getEndNode(); // Do stuff } Since a traverser keeps more memory than a simple call to getRelationships. Another thing, are you doing any write operation in that for-loop of yours? Also do you shut down the batch inserter and start a new EmbeddedGraphDatabase to traverse on, or how do you get a hold of the graphDb? 2010/7/26 Tim Jones bogol...@ymail.com OK, I found out what's taking the time. It's iterating over the result set of a traverser: // visit each Route node, and add it to the array Traverser routes = graphDb.getReferenceNode().traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.DEPTH_ONE, ReturnableEvaluator.ALL_BUT_START_NODE, Relationships.ROUTE, Direction.OUTGOING); for (Node node : routes) { // do stuff } The 'for' loop takes ages. There are probably 2m nodes being returned by that traverser at the moment, and that's only a very small subset of the data I want to add to the database. is there any way to tinker with the neo4j properties or anything to improve performance here? Thanks - Original Message From: Mattias Persson matt...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Sat, July 24, 2010 10:23:02 PM Subject: Re: [Neo4j] Batch inserter shutdown taking forever 2010/7/21 Tim Jones bogol...@ymail.com Hi, I'm using a BatchInserter and a LuceneIndexBatchInserter to insert 5m nodes and 5m relationships into a graph in one go. The insertion seems to work, but shutting down takes forever - it's been 2 hours now. At first, the JVM gave me garbage collection exception, so I've set the heap to 2gb. 
'top' tells me that the application is still running: PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 9994 tim 17 0 2620m 2.3g 238m S 99.5 39.1 115:48.84 java but checking the filesystem by running 'ls -l' a few times doesn't indicate that files are being updated. Is this normal? Is there a way to improve performance? No, it sounds quite weird. Any chance to have a look at your code? I'm loading all my data in one go to ease creating the db - it's simpler to create it from scratch each time instead of updating an existing database - so ideally I don't want to break this job down into multiple smaller jobs (actually, this would be OK if performance was good, but I ran into problems inserting data and retrieving existing nodes). What kind of problems? could you supply code and description of your problems? Problems doing something similar in relational dbs. Also, the API recommends to optimise the batch search index before using it for lookups. I just decided not to take this approach. Thanks, Tim ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Enabling LRU cache with BatchInserter
Are you thinking of the method in LuceneIndexService? The batch inserter index doesn't have such a method. Do you have performance problems inserting stuff, or why do you want such a method? 2010/7/27 Mohit Vazirani mohi...@yahoo.com Hi, I'm trying to call enableCache(..) for the following example: http://wiki.neo4j.org/content/Batch_Insert#Using_batch_inserter_together_with_indexing How would I go about doing that? ~Mohit ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Enabling LRU cache with BatchInserter
- Original Message
From: Mattias Persson matt...@neotechnology.com
To: Neo4j user discussions user@lists.neo4j.org
Sent: Tue, July 27, 2010 12:31:49 PM
Subject: Re: [Neo4j] Enabling LRU cache with BatchInserter

Are you thinking of the method in LuceneIndexService?

Yes.

The batch inserter index doesn't have such a method. Do you have performance problems inserting stuff, or why do you want such a method?

I'm creating outgoing relationships between node pairs obtained from another data source. The external data is clustered on the start node. I'm calling index.getNodes(..) for both nodes and then creating a relationship between them, which is currently pretty slow. I figured that since the data is clustered on the start node, it would speed up the relationship creation if I used an LRU cache.
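The idea of caching lookups for clustered input can be sketched independently of the batch inserter. In this Python sketch the INDEX dict stands in for the slow index lookup and node_for is a hypothetical helper; functools.lru_cache is the standard-library LRU cache, not anything Neo4j provides.

```python
from functools import lru_cache

# Hypothetical key -> node id index, standing in for a slow lookup.
INDEX = {"stop-1": 101, "stop-2": 102, "stop-3": 103}
uncached_lookups = []  # records every lookup that actually hit the index


@lru_cache(maxsize=10000)
def node_for(key):
    uncached_lookups.append(key)
    return INDEX[key]


# Input clustered on the start node, as described above.
pairs = [("stop-1", "stop-2"), ("stop-1", "stop-3"), ("stop-2", "stop-3")]
relationships = [(node_for(start), node_for(end)) for start, end in pairs]
```

With clustered input the start node is resolved once per cluster, so the benefit grows with cluster size; for end nodes it depends on how often they repeat within the cache window.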
Re: [Neo4j] property value encoding
Mapping property values to a discrete set and referring to them by their 'id' is quite reminiscent of a foreign key in a relational database. Why not take the next step and make a node for each value, and link all data nodes to the value nodes? This is then a kind of index, a category index. I was thinking about doing this for the OSM importer myself, but I have an aversion to the number of relationships that would then appear. It is still worth considering, as a relationship takes less space than a string.

Also, another trick I discussed with the neo4j guys (to mixed response) was to use lucene to index the property values, but then not actually save the value on the node. This means that the value exists only in the lucene index. If the only purpose of the value is to find nodes using the index, this is certainly easier than adding relationships. The primary negative comment from the neo4j guys was that lucene is not protected from failure like the neo4j core, so you cannot recreate the index when necessary if you don't have the original properties.

So I'm still favouring the category index approach. In cases where the value diversity is very high (very many different values), the index can be split into a tree to improve performance. In cases where very many data nodes link to very few index nodes, there is another trick I'm fond of, and that is the composite index: indexing multiple properties at the same time, which increases the number of index nodes and decreases the number of data nodes connected to each index node, which is better for query traversal performance :-)
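A small Python sketch of the category index and composite index ideas, using dicts of sets in place of index nodes and relationships. The build_index helper and the sample OSM-like properties are made up for illustration and are not part of any importer.

```python
from collections import defaultdict

# Hypothetical data nodes: node id -> properties.
nodes = {
    1: {"highway": "residential", "surface": "asphalt"},
    2: {"highway": "residential", "surface": "gravel"},
    3: {"highway": "primary", "surface": "asphalt"},
    4: {"highway": "residential", "surface": "asphalt"},
}


def build_index(nodes, keys):
    """Group node ids under one index entry per distinct value combination."""
    index = defaultdict(set)
    for node_id, props in nodes.items():
        index[tuple(props[k] for k in keys)].add(node_id)
    return index


# Category index on a single property.
by_highway = build_index(nodes, ["highway"])

# Composite index on two properties: more entries, fewer nodes per entry.
by_highway_and_surface = build_index(nodes, ["highway", "surface"])
```

Here the single-property index has two entries with up to three nodes each, while the composite index has three entries with at most two nodes each, which is the fan-out reduction described above.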
Re: [Neo4j] GraphML Nested Graphs
I think I found my answer in http://graphml.graphdrawing.org/primer/graphml-primer.html#Nested

The edges between two nodes in a nested graph have to be declared in a graph, which is an ancestor of both nodes in the hierarchy. Note that this is true for our example. Declaring the edge between node n6::n1 and node n4::n0::n0 inside graph n6::n0 would be wrong while declaring it in graph G would be correct. A good policy is to place the edges at the least common ancestor of the nodes in the hierarchy, or at the top level.

-Paul

From: Paul A. Jackson
Sent: Monday, July 26, 2010 2:52 PM
To: 'Neo4j user discussions'
Subject: GraphML Nested Graphs

I am looking into nested graphs and have not found an answer to a specific case. Generally, when a node from one level links to a node in a subgraph, the edge should be defined in the outer graph. In the case where two nodes are in two different (peer) subgraphs at the same level, should the edge go in the outer level that contains them both? See edge e4 below.

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
  <graph id="G" edgedefault="undirected">
    <node id="n0">
      <graph id="n5:" edgedefault="undirected">
        <node id="n5::n1"/>
        <node id="n5::n2"/>
        <edge id="e0" source="n5::n1" target="n5::n2"/>
      </graph>
    </node>
    <node id="n1">
      <graph id="n6:" edgedefault="undirected">
        <edge id="e1" source="n6::n1" target="n6::n2"/>
      </graph>
    </node>
    <edge id="e2" source="n5::n2" target="n0"/>
    <edge id="e3" source="n0" target="n2"/>
    <edge id="e4" source="n6::n1" target="n5::n2"/>
  </graph>
</graphml>

Thanks,
-Paul
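The primer's "least common ancestor" policy can be sketched as a small function. This is an illustration only, not a GraphML parser: it assumes the nesting has already been extracted into a mapping from each nested graph to the graph that encloses it (`parent_graph`) and from each node to the graph that declares it (`graph_of_node`). Both names and the sample data are hypothetical, loosely modelled on the file in this message, and the "::" in the ids is treated as a naming convention rather than parsed.

```python
def ancestors(graph_id, parent_graph):
    """Return graph_id followed by all enclosing graphs, innermost first."""
    chain = [graph_id]
    while graph_id in parent_graph:
        graph_id = parent_graph[graph_id]
        chain.append(graph_id)
    return chain


def edge_home(source, target, graph_of_node, parent_graph):
    """Innermost graph that encloses both end nodes of an edge."""
    source_chain = ancestors(graph_of_node[source], parent_graph)
    target_chain = set(ancestors(graph_of_node[target], parent_graph))
    for graph_id in source_chain:
        if graph_id in target_chain:
            return graph_id
    raise ValueError("nodes do not share an enclosing graph")


# Hypothetical layout: subgraphs "n5:" and "n6:" are peers nested inside "G".
parent_graph = {"n5:": "G", "n6:": "G"}
graph_of_node = {
    "n0": "G",
    "n5::n1": "n5:",
    "n5::n2": "n5:",
    "n6::n1": "n6:",
    "n6::n2": "n6:",
}

print(edge_home("n5::n1", "n5::n2", graph_of_node, parent_graph))  # n5:
print(edge_home("n6::n1", "n5::n2", graph_of_node, parent_graph))  # G
```

Under these assumptions an edge like e4, which joins nodes in two peer subgraphs, lands in the outer graph G, matching the answer quoted from the primer.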
Re: [Neo4j] TransportDublin Route Planner Github Project
Thanks Peter, that looks a lot better.

I'm working on a Neo4j/Ajax combination to display only routes on the map. Ajax and Neo4j are a really powerful combination, and being able to display the neo4j database properties is a useful feature. With a jQuery LiveSearch feature and Spring MVC Ajax combined with a neo4j luceneFullTextQuery, a database can be queried on the fly.

What are the best hosting and memory settings for my kind of app? I have tested on an Amazon EC2 small instance, but it can take from 0.1 to 10 seconds to find a route, depending on distance. Can I replicate the database to allow multiple simultaneous requests?

Thanks
Paddy
Re: [Neo4j] Querying for nodes that have no relationhip to a specfic node
Is it possible to encode the absence of a relationship with a relationship in your application?

Date: Tue, 27 Jul 2010 18:52:10 +0100
From: alberto.perd...@gmail.com
To: user@lists.neo4j.org
Subject: [Neo4j] Querying for nodes that have no relationhip to a specfic node

Hi,

I'm considering using neo4j for a project I'm currently working on. I need to do the following periodically (e.g. daily):

* step 1: for every node, let's call it A, I need to randomly pick n other nodes that fulfill certain attributes and have no relationship to A.
* step 2: for each of those nodes and A, I calculate some value and store it within the relationship.

Regarding step 1, from what I've read, it seems there is no way of querying for nodes that have no relationship to a specific node. Of course I could query all the nodes of the database that fulfill certain attributes, store them in a variable, then query all relationships for node A and subtract those nodes from the array variable. But I think this approach won't work very well as the number and density of relationships gets higher...

Do you have any recommendations? Can you suggest another strategy? Perhaps there is a way of making that query?

Any help is appreciated.

Alberto.
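For reference, the fallback Alberto describes for step 1 (collect the matching nodes, subtract the ones already related to A, then pick n at random) looks roughly like this. It is a plain-Python sketch rather than Neo4j code: `pick_unrelated`, the `candidates` list and the `neighbours` adjacency sets are hypothetical stand-ins, and `neighbours` is assumed to contain related nodes in both directions.

```python
import random


def pick_unrelated(a, candidates, neighbours, n, rng=random):
    """Pick up to n random candidates that have no relationship to node a."""
    related = neighbours.get(a, set())
    # Materialise the whole pool first; this is the step that gets
    # expensive as the number of nodes grows.
    pool = [node for node in candidates if node != a and node not in related]
    return rng.sample(pool, min(n, len(pool)))


# Made-up data: candidates already filtered by attribute, plus adjacency sets.
candidates = ["a", "b", "c", "d", "e", "f"]
neighbours = {"a": {"b", "c"}}

chosen = pick_unrelated("a", candidates, neighbours, n=2, rng=random.Random(42))
```

The set lookup keeps the subtraction itself cheap; the cost Alberto is worried about comes from building `pool` over all matching nodes for every A.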
Re: [Neo4j] Querying for nodes that have no relationhip to a specfic node
If this is feasible in Alberto's application, you have to consider that you will be creating a complete graph, and for such a graph with n nodes you'll have O(n^2) relationships. This can grow really, really fast. Besides, it would turn the insertion of a new node into a potentially slow operation, as you would need to assign the absence relationship to every existing node.

[]'s
Vitor

On Tue, Jul 27, 2010 at 8:51 PM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Is it possible to encode the absence of a relationship with a relationship in your application?
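To put rough numbers on the O(n^2) growth, here is a small back-of-the-envelope calculation. It assumes at most one undirected relationship per pair of nodes and, purely as an example, an average of ten real relationships per node; the function name and the node counts are made up.

```python
def absence_relationships(node_count, existing_relationships):
    """Number of unordered node pairs that are NOT yet related."""
    all_pairs = node_count * (node_count - 1) // 2
    return all_pairs - existing_relationships


for node_count in (1_000, 100_000, 1_000_000):
    missing = absence_relationships(node_count, existing_relationships=10 * node_count)
    print(node_count, missing)
```

Under the same assumptions, adding one node to a graph of n nodes creates n new pairs, which is the insertion cost mentioned above.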
Re: [Neo4j] Querying for nodes that have no relationhip to a specfic node
Sounds like a pretty easy SQL query, though. ;-)

Actually, the random sampling aspect definitely throws a complication into the requirements. I can't even picture how to achieve that in Neo without first obtaining some (large) set of nodes and using a randomizer to select from the set/array. Iterating on getAllNodes and stopping after n matches wouldn't meet the random requirement.

It might be helpful to better understand the application/domain requirements; maybe there are alternative ways to achieve the same results.

Original Message
Subject: Re: [Neo4j] Querying for nodes that have no relationhip to a specfic node
From: Vitor De Mario vitordema...@gmail.com
Date: Tue, July 27, 2010 8:24 pm
To: Neo4j user discussions u...@lists.neo4j.org

If this is feasible in Alberto's application, you have to consider that you will be creating a complete graph, and for such a graph with n nodes you'll have O(n^2) relationships. This can grow really, really fast. Besides, it would turn the insertion of a new node into a potentially slow operation, as you would need to assign the absence relationship to every existing node.

[]'s
Vitor
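One technique that may help with the random requirement, though nobody in the thread proposed it, is reservoir sampling: it draws a uniform sample of n items in a single pass over an iterator without holding the whole candidate set in memory. The sketch below is plain Python under stated assumptions: `all_nodes`, `related_to_a` and the `matches` filter are made-up stand-ins for a node iterator, A's neighbours and the attribute check.

```python
import random


def reservoir_sample(stream, n, rng=random):
    """Uniformly sample up to n items from an iterable of unknown length."""
    sample = []
    for seen, item in enumerate(stream, start=1):
        if len(sample) < n:
            sample.append(item)
        else:
            # Keep the new item with probability n / seen.
            slot = rng.randrange(seen)
            if slot < n:
                sample[slot] = item
    return sample


def unrelated_candidates(all_nodes, a, related, matches):
    """Lazily yield nodes that pass the filter and are not related to a."""
    for node in all_nodes:
        if node != a and node not in related and matches(node):
            yield node


# Made-up stand-ins for the node iterator, A's neighbours and the filter.
all_nodes = range(1, 1001)
related_to_a = {2, 3, 5, 8}
picked = reservoir_sample(
    unrelated_candidates(all_nodes, a=1, related=related_to_a,
                         matches=lambda node: node % 2 == 0),
    n=5,
    rng=random.Random(7),
)
```

This still visits every node once per A, so it addresses the memory concern and the randomness requirement but not the cost of the full scan.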