Re: [Neo] meta meta classes
2010/3/30 Niels Hoogeveen pd_aficion...@hotmail.com

MetaModelObject already has a getter (without using the word get) to access the node. Wrapping MetaModelObject to give it a Node interface makes it possible to directly write metaObject.setProperty(a, b) instead of metaObject.node.setProperty(a, b).

If that were all, I wouldn't make a post about it. The more interesting part is the meta modeling of the node obtained from a MetaModelObject. This node represents the class, but it is not an instance of any class itself. That's where the reify method comes into play, which takes the node from a MetaModelObject, creates a class with the same name but in a different namespace, and makes the node an instance of this new class. With that construction it becomes possible to model the relationships/properties of a class.

There are many examples where this can be handy. I already gave the example of HTML tags, where the attributes can be modeled as properties and the tag name as a property of the meta class.

Another example: all countries have subdivisions. Countries and their subdivisions are all instances of a certain class. There is one class for country, but there is a set of classes for subdivisions. In some countries a subdivision is called a province, in others it's a state or a district. Subdivisions of various countries can have different sets of properties and relations. How do we model the fact that the instance United States of America has subdivisions of the class State_(US)? Using the reify method we can make a class United States of America, where we can model that the ClassRange of State_(US) is United States of America. Or, if we want the relationship to point the other way, that the United States of America has 50 (use of cardinality) subdivisions of the class State_(US).

Here you take a step into modeling the actual data into the meta model which describes the data.
Is it desirable to first model exactly how the data will look, and then add data so that it looks like that? I get a feeling that the data is described twice here... Of course all this can directly be expressed with nodes and relationships, but that's what the meta model does anyway. I do have one peeve with the meta model API. The class DataRange has a constructor: DataRange(String datatype, Object... values) I'd much rather see a new Restrictable class Datavalue, and see a constructor: DataRange(String datatype, Datavalue... values) That way the possible values a property can have additional properties and relationships (eg. link to Wordnet definition or Wikipedia entry). Kind regards, Niels Hoogeveen Date: Tue, 30 Mar 2010 09:30:10 +0200 From: matt...@neotechnology.com To: user@lists.neo4j.org Subject: Re: [Neo] meta meta classes Would making the underlying Node publically available (via a getter) be virtually the same thing? In that case the meta model classes could have such a getter. 2010/3/26 Niels Hoogeveen pd_aficion...@hotmail.com Hi Peter, I added a Wiki entry in my github repo called Reification of meta classes and meta properties: http://wiki.github.com/NielsHoogeveen/Scala-Neo4j-utils/reification-of-meta-classes-and-meta-properties The source code for the Scala wrappers can be found found in my repo: http://github.com/NielsHoogeveen/Scala-Neo4j-utils Kind regards, Niels Hoogeveen From: neubauer.pe...@gmail.com Date: Fri, 26 Mar 2010 17:25:01 +0100 To: user@lists.neo4j.org Subject: Re: [Neo] meta meta classes Awesome Niels! maybe you could blog or document some cool example on this? Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.tinkerpop.com - Processing for Internet-scale graphs. 
http://www.thoughtmade.com - Scandinavias coolest Bring-a-Thing party. On Fri, Mar 26, 2010 at 5:22 PM, Niels Hoogeveen pd_aficion...@hotmail.com wrote: Using Scala, I was actually able to extend MetaModelThing to act as a Node and MetaModelClass to have shadowing functionality for both MetaModelClasses and for MetaModelProperties, without touching the original source code. To: user@lists.neo4j.org From: rick.bullo...@burningskysoftware.com Date: Fri, 26 Mar 2010 14:29:03 + Subject: Re: [Neo] meta meta classes Such are the joys and challenges of frameworks and abstractions. Sometimes you do need to get close to the metal though, to achieve specific functional and performance requirements. Thus the reason open source frameworks are awesome. At least we can
Re: [Neo] How to efficiently query in Neo4J?
Since no one responded yesterday, I wanted to re-emphasize that there are probably substantial optimizations that can be made in a well-known problem domain such as this. For example, by using pre-calculated relevance measures for tags, and by narrowing the returned set of posts/nodes as rapidly as possible using the least used tag(s) in progressive order.

It would be quite trivial (and reasonably performant) to maintain a pair of properties on each node in the tag hierarchy that count the number of relationships of the tag and all its children. Each time a tagging relationship is added to a post, simply add 1 to this property for the tag node and all its ancestors/parents. Then, when you are provided with a list of tags to search upon, order them starting with the least frequently used tag by leveraging this metric and execute your traversals/set analysis in that order.

I also think my proposal for a two-directional search (first from the direction of the least frequently used tag to the posts that include it, followed by a search from each of those posts back to its tags, as described in a previous message) could be quite fast.

Another compound index approach that can be used, which is somewhat of a brute force method, is to maintain a property on each tag node that consists of its aggregate name, e.g. Europe.Italy.Toscana.Siena or Activities.Active.Cycling.MountainBiking. When doing a search for Cycling activities in Italy, you could grab the aggregate names for Italy (Europe.Italy) and Cycling (Activities.Active.Cycling). Then, using whatever mechanism you choose for your initial node traversal (exhaustive or least-frequently-used tag), you can compare the aggregate names of the tags assigned to a post node to the aggregate names of the desired tags using a simple String.startsWith(). For example, if I posted regarding a mountain bike ride I took in the hills around Siena, tagged with the above aggregate names, it would successfully match.
My first thought was that this could be problematic if a tag term appeared multiple times in the tag hierarchy, but that's easily managed on the query side. Just trying to make the point that abstract or generic traversal schemes aren't always optimal and that it is often worth the effort to explore domain-specific approaches. Does that make any sense?

Original Message
Subject: Re: [Neo] How to efficiently query in Neo4J?
From: Craig Taverner cr...@amanzi.com
Date: Wed, April 07, 2010 7:05 pm
To: Neo user discussions user@lists.neo4j.org

Hi Alastair,

I have been using what you tag the 'composite index' although in mysql. It's fast, but a pain to manage (as you need to keep the index up to date), so I would like to stay away from indexes *if possible*.

I would think that you only need to take action when you add or modify a node, and then only to (re)connect it to the index tree (creating index nodes on demand, if missing). This can be embedded in your domain classes, so indexing is automatic. You can even synchronously 'garbage-collect' unused index nodes (if the node unlinked was the last node for that index node). I think the index service for this needs to be well tested for all scenarios, but should ultimately have a very simple API, with no manual management requirement.

My one concern with the composite index for your case is that all my thinking on this has been for numerical indexes, where I plan to query with inequalities (eg. return all restaurants with rating = 4 stars). I've not thought about how to solve hierarchical tags like you have.

One further optimisation is to only store new items in the hash on the first traversal. Then, in the subsequent traversals, if the key does not exist, there is no need to add a key with count 1, as it cannot ever be emitted. This limits the memory requirements to the order of the first traversal, so if you pick that well, it should be better.

Nice idea.
It makes your approach more like the 'one set intersection' approach in terms of memory. Picking a good first query seems a common need for many of the solutions. I presume RDBMSs have a query optimization phase that figures that out. I'm hoping to completely avoid that kind of non-deterministic approach with the composite index.

Cheers, Craig

___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
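The usage-count and aggregate-name ideas discussed in this thread can be sketched in plain Java. This is only an illustration under stated assumptions, not Neo4j API code: the class TagIndexSketch, its method names, and the in-memory map standing in for properties on tag nodes are all hypothetical. It also compares against the wanted name plus a trailing dot, a small variation on the plain String.startsWith() mentioned above, so that a tag such as Europe.Italya would not match Europe.Italy.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a plain map stands in for counters stored on tag nodes.
class TagIndexSketch {
    // Aggregate tag name -> number of taggings on that tag or any descendant.
    private final Map<String, Integer> usageCounts = new HashMap<>();

    // Record one tagging: add 1 to the tag itself and to every ancestor.
    void recordTagging(String aggregateName) {
        String current = aggregateName;
        while (true) {
            usageCounts.merge(current, 1, Integer::sum);
            int lastDot = current.lastIndexOf('.');
            if (lastDot < 0) {
                break;
            }
            current = current.substring(0, lastDot);
        }
    }

    int usageCount(String aggregateName) {
        return usageCounts.getOrDefault(aggregateName, 0);
    }

    // Order the search tags so the least frequently used tag comes first.
    List<String> orderByLeastUsed(List<String> searchTags) {
        List<String> ordered = new ArrayList<>(searchTags);
        ordered.sort(Comparator.comparingInt(this::usageCount));
        return ordered;
    }

    // A post tag matches when it is the wanted tag or one of its descendants.
    static boolean matches(String postTag, String wantedTag) {
        return postTag.equals(wantedTag) || postTag.startsWith(wantedTag + ".");
    }

    // A post matches when every wanted tag is covered by at least one post tag.
    static boolean postMatchesAll(List<String> postTags, List<String> wantedTags) {
        for (String wanted : wantedTags) {
            boolean found = false;
            for (String postTag : postTags) {
                if (matches(postTag, wanted)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }
}
```

With the Siena mountain-biking post from the example, a search for Europe.Italy and Activities.Active.Cycling would start from whichever of the two has the lower count and then check the remaining tag by prefix.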
Re: [Neo] Unable to memory map
Hi,

The read-only version is not faster on reads than a writable store. Internally the only difference is that we open files in read-only mode. The reason you get the error is that your OS does not support mapping a memory region onto a file opened in read-only mode when the region extends beyond the file data (in write mode the file will grow in size when that happens).

-Johan

On Mon, Mar 29, 2010 at 9:03 PM, Marc Preddie mpred...@gmail.com wrote:

Hi, I've had some time to look into this issue and it seems that when using the ReadOnly versions of the classes, I get the memory mapping warnings, and when using the Writable versions of the classes, the warning does not occur (I'm assuming memory mapping gets enabled). I'm not against using the writable versions of the classes; my only concern is performance. Are the read-only versions faster than the writable versions? And if they are, then if memory mapping is not enabled, are they faster than the writable versions with memory mapping? I'll run some tests, but I guess I would like an expert opinion. Regards, Marc

On Mon, Mar 22, 2010 at 10:24 AM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote:

Hi, We have seen this message before, emitted as a warning from Neo4j. Are you seeing this as a warning as well, or are you getting an exception thrown to your application code? It's hard to deal with these errors since nio only throws IOException, with no more semantic information than that. I believe we deal with all cases by issuing a warning and then falling back to another method of performing the same operation, but if you are getting exceptions we need to resolve it. If you are indeed getting exceptions, some code that triggers it would be very helpful. Cheers, Tobias

On Wed, Mar 17, 2010 at 1:47 PM, Marc Preddie mpred...@gmail.com wrote:

Hi, I've looked at the mailing list and found one similar situation, but no real solution. So I was hoping someone could shed some light on this.
I seem to have an issue with neo4j being able to use memory mapped files. I've run my service on Win XP 64bit, Mac OSX Snow Leopard 10.6.2 and Centos 5.x 64bit and always get the same error when launching. I'm using APOC 1.0 and have a DB of approx 600M. In my neo config I allocate about 5M more for each type of file than the actual file size (I've tried multiple different settings). On each machine I also leave at least 1.5G for the OS and have at least 2.5G heap for the Java process. I'm also using the classes EmbeddedReadOnlyGraphDatabase and LuceneReadOnlyIndexService to access and browse the DB.

Neo config:

neostore.nodestore.db.mapped_memory=10M
neostore.relationshipstore.db.mapped_memory=110M
neostore.propertystore.db.mapped_memory=85M
neostore.propertystore.db.index.mapped_memory=10M
neostore.propertystore.db.index.keys.mapped_memory=10M
neostore.propertystore.db.strings.mapped_memory=320M
neostore.propertystore.db.arrays.mapped_memory=10M

Here is the error:

org.neo4j.kernel.impl.nioneo.store.MappedMemException: Unable to map pos=3005872 recordSize=33 totalSize=1153416
    at org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow.<init>(MappedPersistenceWindow.java:59)
    at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.allocateNewWindow(PersistenceWindowPool.java:530)
    at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.refreshBricks(PersistenceWindowPool.java:430)
    at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:122)
    at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:459)
    at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getChainRecord(RelationshipStore.java:248)
    at org.neo4j.kernel.impl.nioneo.xa.NeoReadTransaction.getMoreRelationships(NeoReadTransaction.java:103)
    at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$ReadOnlyResourceConnection.getMoreRelationships(NioNeoDbPersistenceSource.java:275)
    at org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:93)
    at org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:585)
    at org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:332)
    at org.neo4j.kernel.impl.core.NodeImpl.ensureFullRelationships(NodeImpl.java:320)
    at org.neo4j.kernel.impl.core.NodeImpl.getAllRelationshipsOfType(NodeImpl.java:129)
    at org.neo4j.kernel.impl.core.NodeImpl.getSingleRelationship(NodeImpl.java:179)
    at org.neo4j.kernel.impl.core.NodeProxy.getSingleRelationship(NodeProxy.java:98)
Caused by: java.io.IOException: Access is denied
    at sun.nio.ch.FileChannelImpl.truncate0(Native Method)
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:728)
    at
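Johan's explanation in this thread is that a mapped region which reaches past the end of a file opened in read-only mode cannot be established, because the file cannot grow. One way to stay clear of that situation in application code is to clamp the requested region to the current file size before mapping. The sketch below uses only standard java.nio calls; the class ReadOnlyMappingSketch and its method names are hypothetical, and it is not a description of how the Neo4j kernel handles its persistence windows.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Neo4j.
class ReadOnlyMappingSketch {
    // Number of bytes that can be mapped at 'position' without passing the end of the file.
    static long clampedSize(long fileSize, long position, long wantedSize) {
        if (position >= fileSize) {
            return 0;
        }
        return Math.min(wantedSize, fileSize - position);
    }

    // Map at most 'wantedSize' bytes read-only, never extending beyond the file data.
    static MappedByteBuffer mapWithinFile(Path file, long position, long wantedSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = clampedSize(channel.size(), position, wantedSize);
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, position, size);
        }
    }
}
```

For a 100-byte file, asking for 33 bytes at position 90 would map only the last 10 bytes instead of requesting a region that ends at byte 123.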
Re: [Neo] meta meta classes
The examples of the tag library and of countries/sub-divisions are not necessarily similar. The first shows the need to model the properties of a class. The second shows the need for singleton classes, which is a different concept and something that cannot be done out of the box, but I will show a solution that requires minimal modifications to the current software.

Suppose we want to model countries and their sub-divisions. We do the following:

  create class country
  create class sub-division
  create property has-sub-division
  make has-sub-division a property of country

Now we'd like to populate the database. Since all countries have different types of sub-divisions, we need to create those:

  create class french_region
  create class canadian_province
  create class canadian_territory
  create class US_state
  etc.

Populate the various classes with instances:

  create node for Alsace and make it an instance of french_region
  create node for Aquitaine and make it an instance of french_region
  create node for Alberta and make it an instance of canadian_province
  create node for Nunavut and make it an instance of canadian_territory
  create node for British Columbia and make it an instance of canadian_province
  create node for Alabama and make it an instance of US_state
  create node for Alaska and make it an instance of US_state
  etc.

We'd like to state that each country has its own restriction on the type of sub-division. To do that we need to create classes for each country:

  create class France
  create class Canada
  create class United_States_of_America
  etc.

  make class France a subclass of country
  make class Canada a subclass of country
  make class United_States_of_America a subclass of country
  etc.
  create restriction for France on has-sub-division with range french_region and cardinality = 26
  create restriction for Canada on has-sub-division with range canadian_province and cardinality = 10
  create restriction for Canada on has-sub-division with range canadian_territory and cardinality = 3
  create restriction for United_States_of_America on has-sub-division with range US_state and cardinality = 50
  etc.

Now we have classes for each country, but no instances. Unfortunately we cannot say that a class is an instance of itself; that would require a relationship where the end node equals the start node, which Neo4J doesn't allow. So we have to create separate instances for each country:

  create node for France
  create node for Canada
  create node for United_States_of_America
  etc.

  make node France an instance of class France
  make node Canada an instance of class Canada
  make node United_States_of_America an instance of class United_States_of_America

And finally link the sub-divisions to their countries:

  France has-sub-division Alsace
  France has-sub-division Aquitaine
  Canada has-sub-division Alberta
  Canada has-sub-division British Columbia
  Canada has-sub-division Nunavut
  United_States_of_America has-sub-division Alabama
  United_States_of_America has-sub-division Alaska

So we end up having a class for each country and an instance for each country.

Writing this down, I realize the patch I sent you a few days ago contains a minor flaw that needs to be fixed. In that patch I added an indexed property uri to the meta model, to bring it in line with what is being done in the RDF module, and to make certain that the same URI is not used for two different properties. Without uniqueness a class can have several different PropertyTypes with the same name. In that situation the lookup of the PropertyType of a property or relationship becomes impossible. The flaw in my patch is the name of the uri property, which should be something like class_uri.
That way the class of each country can be given the same URI as the instance of each country, because they live in different name spaces. This same technique is used in OWL to provide punning. An instance and a class can have the same URI, because instances in OWL live in a different namespaces from classes. Through the use of the uri property and the class_uri property we can also distill that each country class is a singleton class, because there exists an instance with the same URI. That way we can work around the limitation that relationships cannot have the same start and end node. Furthermore, it allows for some extra restrictions to MetaModelClass with the following logic: If a class has exactly one instance where the uri of that instance equals the class_uri of the class, no more instances can be added And if there is an instance of a class without a uri that equals the class_uri, no instances can be added where the uri of the instance equals the class_uri of the class. With that logic, we have proper singleton classes in the meta model of Neo4J. Kind regards, Niels Hoogeveen Date: Thu, 8 Apr 2010 11:37:37 +0200 From: matt...@neotechnology.com To: user@lists.neo4j.org Subject: Re: [Neo] meta meta classes 2010/3/30 Niels
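The two singleton rules stated at the end of Niels's message can be written down as a small check. The sketch below is a plain-Java model of that logic only: the class SingletonClassSketch, its fields, and its methods are hypothetical and are not part of the Neo4j meta model API or of the patch discussed here. URIs are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of a meta model class with a class_uri.
class SingletonClassSketch {
    private final String classUri;
    private final List<String> instanceUris = new ArrayList<>();

    SingletonClassSketch(String classUri) {
        this.classUri = classUri;
    }

    // A singleton class has exactly one instance, whose uri equals the class_uri.
    boolean isSingleton() {
        return instanceUris.size() == 1 && instanceUris.get(0).equals(classUri);
    }

    boolean canAdd(String instanceUri) {
        // Rule 1: a singleton class accepts no further instances.
        if (isSingleton()) {
            return false;
        }
        // Rule 2: once an instance with a different uri exists, an instance
        // whose uri equals the class_uri can no longer be added.
        if (instanceUri.equals(classUri)) {
            for (String existing : instanceUris) {
                if (!existing.equals(classUri)) {
                    return false;
                }
            }
        }
        return true;
    }

    // Returns true if the instance was accepted.
    boolean addInstance(String instanceUri) {
        if (!canAdd(instanceUri)) {
            return false;
        }
        instanceUris.add(instanceUri);
        return true;
    }
}
```

In the country example, the class France would accept the node France (same URI) and then reject everything else, while the class country would accept France and Canada but never an instance carrying the URI of country itself.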
Re: [Neo] getNumberOfIdsInUse(Node.class)) return -1
Hi, I had a look at this and can not figure out why -1 is returned. When running the kernel in normal (write) mode the return value of number of ids in use will only be correct if all previous shutdowns have executed cleanly. This is an optimization to reduce the time spent in recovery rebuilding id generators after a crash/non clean shutdown. After a crash/non clean shutdown the number of ids in use will always be the highest id in use + 1. To force a full rebuild of the id generators on each startup (on a non clean shutdown) pass in the following configuration: rebuild_idgenerators_fast=false In read only mode the return value will always be the highest id in use + 1. You could try to delete the neostore.nodestore.db.id and pass in rebuild_idgenerators_fast=false as configuration when starting up (this will take a long time if the node store file is large). If you still get incorrect results send me a compressed version of the neostore.nodestore.db.id file and I will have a look at it. Regards, -Johan On Tue, Apr 6, 2010 at 3:38 PM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: Sorry, we have not had time to look into that yet. I'll let you know when we have. On Mon, Apr 5, 2010 at 12:31 PM, Laurent Laborde kerdez...@gmail.comwrote: Any news ? -- Ker2x On Fri, Mar 26, 2010 at 12:05 PM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: Ok, thanks. We'll look into it. On Fri, Mar 26, 2010 at 11:49 AM, Laurent Laborde kerdez...@gmail.com wrote: something between 100 millions and 1 billions, i guess. the DB contain the result of my collatz code from 1 to 100 millions. -- Ker2x On Fri, Mar 26, 2010 at 11:40 AM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: If you have a large number of nodes it could be a truncation error from long to int somewhere, how many nodes to you estimate that you have? It is a bug so we will fix it, but if we know the approximate estimated size it would help in finding the cause. 
/Tobias

On Fri, Mar 26, 2010 at 7:59 AM, Laurent Laborde kerdez...@gmail.com wrote:

My code does:

    System.out.println( "Number of nodes : " + neo.getConfig().getNeoModule().getNodeManager().getNumberOfIdsInUse( Node.class ) );

It prints:

    Number of nodes : -1

Why does it print -1? How can I count nodes? Thank you :)
Re: [Neo] creating nodes with our own id
Did you make any progress with this? I could provide you with an example as well, here goes:

    GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "my/path" );
    IndexService index = new LuceneIndexService( graphDb );

    // This is how to create and index a UUID for a node.
    ...
    Node node = graphDb.createNode();
    node.setProperty( "uuid", java.util.UUID.randomUUID().toString() );
    index.index( node, "uuid", node.getProperty( "uuid" ) );
    ...

    // This is how to get a node for a certain UUID
    ...
    Node node = index.getSingleNode( "uuid", "9cd0b5b0-7cb4-4806-8b54-39803b1a44e2" );
    ...

2010/4/6 Mattias Persson matt...@neotechnology.com

2010/4/5 Niels Hoogeveen pd_aficion...@hotmail.com

UUIDs are for all practical purposes unique, so you can use those for an ID and have uniqueness for free.

+1 That would also be my answer to that.

Kind regards, Niels Hoogeveen

Date: Mon, 5 Apr 2010 10:54:08 +0530
From: sivait...@gmail.com
To: matt...@neotechnology.com
CC: user@lists.neo4j.org
Subject: Re: [Neo] creating nodes with our own id

Hi Mattias Persson, Thanks for your reply. But setting a property cannot give your node uniqueness. I want to use my own id for unique node representation; otherwise I have to remember the ids of nodes when I want the information back. Thanks, Bujji

On Wed, Mar 31, 2010 at 4:52 PM, Mattias Persson matt...@neotechnology.com wrote:

So, the LuceneIndexService is in the neo4j-index component (as I referred to in the previous mail), http://components.neo4j.org/neo4j-index/ . It is a separate component which depends on the Neo4j kernel component. Source code links are available at the above page; for short it's https://svn.neo4j.org/components/index/trunk/ . Also, neo4j-index in turn has its own dependencies, f.ex. lucene and the neo4j-commons component, so it's recommended to use a dependency manager, f.ex. maven, to gather all the dependencies; see http://wiki.neo4j.org/content/Getting_Started_Guide for more information about that.
2010/3/31 Bujji sivait...@gmail.com

Hi Mattias Persson, Thanks for your quick response. From what I see in release 1.0, there is no Lucene (indexer) component in it. Please tell me where and how I can get the source from the repository. Thanks and Regards, Bujji

Message: 7
Date: Wed, 31 Mar 2010 09:58:40 +0200
From: Mattias Persson matt...@neotechnology.com
Subject: Re: [Neo] creating nodes with our own id
To: Neo user discussions user@lists.neo4j.org
Message-ID: k2kacdd47331003310058idbbbf320h956430a2e0289...@mail.gmail.com
Content-Type: text/plain; charset=UTF-8

The node ids shouldn't be used for such lookups. Either you traverse to them via relationships and other nodes, or you can use the neo4j-index component, http://components.neo4j.org/neo4j-index/ , where you can index nodes and do lookups, f.ex:

    GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "my/path" );
    IndexService index = new LuceneIndexService( graphDb );

    // within a transaction
    Node myNode = graphDb.createNode();
    myNode.setProperty( "uid", "abc123" );
    index.index( myNode, "uid", myNode.getProperty( "uid" ) );
    Node myNodeFoundViaIndex = index.getSingleNode( "uid", "abc123" );

NOTE: Indexing operations automatically participate in Neo4j transactions.

2010/3/31 Bujji sivait...@gmail.com

Hi all, I am not clear on how to use the nodes once we create them with identifiers generated by the program.
How do I remember them? I want to have my own id for each node when I am creating a node. Is that possible, and what changes would I have to make for it to work like that? Otherwise, please give me a working example that uses Neo4j as it is and shows how it uses its ids. Please help me. Thanks, bujji

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo] Node not found using BatchInserter
If it's not a very large data set (or you have enough RAM) you could keep stuff like that in a HashMap really, that's how I do it sometimes... that way you can get rid of that extra lookup and it'll be faster. So you insert relationship as usual and in addition store that relationship (i.e. start node id, end node id) in f.ex. a HashMap for later lookup. 2010/4/1 Amir Hossein Jadidinejad amir.jad...@yahoo.com Ok. Thank you very much. I just want to load a huge graph but during insertion I have to lookup for previous relations (In order to prevent of duplicate relations). Using BatchInserter isn't applicable?, Better idea? --- On Thu, 4/1/10, Mattias Persson matt...@neotechnology.com wrote: From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo] Node not found using BatchInserter To: Neo user discussions user@lists.neo4j.org Date: Thursday, April 1, 2010, 2:32 AM rel_itr.next() returns relationship ids, not node ids... that's why you get those NotFoundExceptions. What you'd need to do is to inserter.getRelationshipById( rel-id ) on those ids and get either start or end node from it. But, is it really right to use the batch inserter in your case? BatchInserter is only meant to be used if you're doing a one-time initial loading of a big dataset, but never in production or when you have a database containing data. Use the EmbeddedGraphDatabase for normal use. 
2010/3/31 Amir Hossein Jadidinejad amir.jad...@yahoo.com

Hi, Check the following code:

    for (Iterator<Long> rel_itr = inserter.getRelationshipIds(current_node).iterator(); rel_itr.hasNext();) {
        long neighbor = rel_itr.next();
        if (neighbor != current_node && neighbor != -1) {
            try {
                exist_neighbors.add(inserter.getNodeProperties(neighbor).get("cui").toString());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

After running, I get a lot of this error:

org.neo4j.graphdb.NotFoundException: id=3225225
    at org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.getNodeRecord(BatchInserterImpl.java:517)
    at org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.getNodeProperties(BatchInserterImpl.java:238)
    at org.qiau.wnng.build.BuildGraph.addAllNodes(BuildGraph.java:220)
    at org.qiau.wnng.build.BuildGraph.main(BuildGraph.java:288)

Is it possible that a neighbor node is not found even though the getRelationshipIds method returned it?!
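Mattias's suggestion in this thread, keeping already-inserted relationships in an in-memory map so duplicates can be detected without reading them back through the batch inserter, could look roughly like the following. This is a sketch under assumptions: RelationshipCacheSketch and its methods are hypothetical names, it keys on (start node id, end node id) only, and it treats a relationship from a to b as different from one from b to a. The actual call that creates the relationship is left to the caller.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: remembers which (start, end) pairs were already inserted.
class RelationshipCacheSketch {
    // Start node id -> ids of end nodes already connected from it.
    private final Map<Long, Set<Long>> endNodesByStart = new HashMap<>();

    // Returns true if the pair was not seen before, in which case the caller
    // should go on and create the relationship; false means it is a duplicate.
    boolean markIfNew(long startNodeId, long endNodeId) {
        return endNodesByStart
                .computeIfAbsent(startNodeId, id -> new HashSet<>())
                .add(endNodeId);
    }

    boolean contains(long startNodeId, long endNodeId) {
        Set<Long> ends = endNodesByStart.get(startNodeId);
        return ends != null && ends.contains(endNodeId);
    }
}
```

During loading, the relationship would only be created when markIfNew returns true. As noted in the reply above, this only works when the data set is small enough (or there is enough RAM) to hold all pairs in memory.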
Re: [Neo] Requirements for an event framework for Neo4j
Well, these are the kinds of questions I would like input on: what is it that you need? But since I am a user as well as a designer of this, I can answer these questions from my perspective. I'll do so inline.

On Wed, Mar 31, 2010 at 5:26 PM, Rick Bullotta rick.bullo...@burningskysoftware.com wrote:

> Hi, Tobias. That's awesome news. A few general questions regarding an event framework for Neo4J...
>
> - In the current implementation, there's a thread affinity for transactions. I am guessing that this could create big challenges for proactive handlers that are potentially executed on a different thread?

My thinking is that the event handlers would get access to some sort of objects that represent the changes made in the transaction. These objects could be accessed outside of a transactional context. The proactive handlers would, however, have to be executed synchronously in the same thread as the transaction. The reactive handlers would execute on a different thread, and for them it would be nice to be able to operate on the graph without needing a transactional context. Opening a read transaction isn't that big of a deal, though, so I think it will work out.

> - Will the handlers be synchronous or asynchronous?

I answered this above...

> - Also, another consideration is whether or not you want to provide support for event folding for chatty changes to properties on nodes/relationships (e.g. you choose the quality of service - all changes or most recent changes only if you haven't yet processed the mutation event).

I would like to keep the number of events fired as low as possible, meaning that an onNodePropertyChange() event is probably too chatty; onBeforeWriteTransactionCommit(SomeObjectWithTheChanges) is probably a better level. But any input on what you would need is useful. So I would say that you would only observe the changes that were present at commit, and no events would be fired before commit.

> - What do you envision passing along with events? A full copy of the node/relationship? Only the mutated property?

If we can keep it to only the mutated state, that would be great. If we can limit ourselves to "the node with this ID changed somehow", that would be even better. Actually, I think we could limit ourselves to that: the proactive events would be fired in the same thread as the transaction while the transaction is still open, so the modified nodes and relationships are still available, and in the reactive handlers you could open a transaction to get to the current state (the changed state might already be stale anyway).

> - Would there be support for bucketed notifications that would allow notifications on multiple property changes on a node to be processed as a single entity?

See my answer to the folding question.

> Looking forward to seeing how this all materializes!
>
> Rick
>
> -Original Message-
> From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Tobias Ivarsson
> Sent: Wednesday, March 31, 2010 6:39 AM
> To: Neo user discussions
> Subject: [Neo] Requirements for an event framework for Neo4j
>
>> Fellow developers!
>>
>> The time has come to start the work on an event framework for Neo4j. To do this well, we would like input on the requirements you have for an event framework. We would like a list of use cases for which you would use an event framework, along with the features you think each use case would need from it (i.e. which events you would like to be notified about, and when). We would also like you to explain why these features are required by the use case. Events can easily degrade performance if the framework is ill designed, so we would like to keep things very lean.
>>
>> We have made some early analysis and arrived at the following conclusions:
>>
>> * There can be two kinds of event handlers: Proactive event handlers and Reactive event handlers. Proactive event handlers have the ability to preempt operations. Reactive event handlers simply react to an event and cannot cause the event to not succeed.
>> * There are three kinds of events in the Neo4j kernel:
>>   - Lifecycle events, such as shutdown.
>>   - Transactional events, such as start commit, commit successful, rollback, etc.
>>   - Data modification events, such as node created, property changed, relationship removed, etc.
>>
>> It is possible that other components, such as the indexing component, would want to add more events to the event framework.
>>
>> These are of course just some initial input to get your thoughts going; feel free to think outside of the constraints above. Our ultimate goal is to create an event framework that is as useful as possible while maintaining

-- Tobias Ivarsson tobias.ivars...@neotechnology.com
Hacker, Neo Technology www.neotechnology.com
Cellphone: +46 706 534857
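The split between proactive and reactive handlers discussed in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the framework that was being designed: every name here (ChangeSet, ProactiveHandler, ReactiveHandler, EventDispatcher) is hypothetical, and the "commit" is a placeholder. It shows proactive handlers running synchronously in the committing thread with the ability to veto, and reactive handlers being notified on a separate thread after the commit, receiving only the ids of the changed nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical names throughout; none of this is Neo4j API.

/** What handlers receive: only the ids of nodes changed by the transaction. */
final class ChangeSet {
    final List<Long> changedNodeIds;

    ChangeSet(List<Long> changedNodeIds) {
        this.changedNodeIds = Collections.unmodifiableList(new ArrayList<>(changedNodeIds));
    }
}

/** Proactive: runs before commit in the committing thread; throwing vetoes the commit. */
interface ProactiveHandler {
    void beforeCommit(ChangeSet changes);
}

/** Reactive: runs after commit on another thread; cannot affect the outcome. */
interface ReactiveHandler {
    void afterCommit(ChangeSet changes);
}

final class EventDispatcher {
    private final List<ProactiveHandler> proactive = new ArrayList<>();
    private final List<ReactiveHandler> reactive = new ArrayList<>();
    private final ExecutorService background = Executors.newSingleThreadExecutor();

    void register(ProactiveHandler handler) { proactive.add(handler); }
    void register(ReactiveHandler handler) { reactive.add(handler); }

    /** Returns true if the commit went through, false if a proactive handler vetoed it. */
    boolean commit(ChangeSet changes) {
        for (ProactiveHandler handler : proactive) {
            try {
                handler.beforeCommit(changes);
            } catch (RuntimeException veto) {
                return false; // a real kernel would roll the transaction back here
            }
        }
        // A real kernel would write the transaction here, then notify asynchronously.
        for (ReactiveHandler handler : reactive) {
            background.execute(() -> handler.afterCommit(changes));
        }
        return true;
    }

    /** Stops accepting work and waits briefly for queued reactive handlers. */
    void shutdownAndWait() {
        background.shutdown();
        try {
            background.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Firing one event per commit, rather than one per property change, is what keeps this design from becoming chatty: a handler that needs details can look the changed nodes up itself.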
Re: [Neo] Some feedback on ZooKeeper use in Neo4j zha.
On 04/08/2010 03:20 AM, Johan Svensson wrote:

> Hi Patrick,
>
> Thanks for the feedback. I will have a look at this and implement handling for disconnection and expiration of sessions.

No problem. We'll be psyched to see you roll this out.

> Regarding the GC issues, we are well aware of these (hopefully the new garbage first or G1 GC will solve these problems). As you say, the concurrent mark sweep GC helps a lot, but more important for avoiding GC thrashing is to make sure there is more (10-15%) available heap than the application ever consumes at any given moment.

Agree. HBase has tested G1 in 1.6.x but so far it is not stable enough for production use.

> I do have a question regarding ZooKeeper. Is there a reason why there is no embedded version? Now I have to start two JVMs on each machine when I really just want to:
>
>     // start a zookeeper server on this machine
>     ZooKeeperServer server = new ZooKeeperServer( 2181 );
>     // start a client and pass in some zookeeper servers
>     ZooKeeper zoo = new ZooKeeper( localhost:2181, otherhost:2181, ..., ... );

I'm not sure what you mean by embedded. In production you typically want to have dedicated hosts for the servers. See this page for some insight: http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting

HBase wraps the zk server for their quickstart type use cases (in their startup scripts). But in large online production serving environments you typically run a ZK cluster separately from the client application.

Patrick

> Regards,
> -Johan
>
> On Wed, Apr 7, 2010 at 12:08 AM, Patrick Hunt ph...@apache.org wrote:
>
>> Hi, I'm Patrick (http://twitter.com/phunt) from the ZooKeeper team. Peter Neubauer brought to my attention today that you are considering use of ZooKeeper in Neo4j, that's great! I took a quick look at the code you currently have in SVN and wanted to provide a bit of feedback.
>>
>> I don't know your domain requirements, but in general the mechanics of ZooClient use look fine. The use of a 5-second timeout is fine. This allows you to detect a client (zk client) failure after just 5 seconds. So if the node/process crashes you'd identify this after 5 seconds, same if a network connection fails, etc...
>>
>> One thing you may not have considered, though, is that anything that prevents the client from heartbeating to the server would also cause the session to be expired (sessions are expired when the zk cluster fails to hear from the client within the timeout), so long GC pauses could trigger this as well. In 1.6.x JVMs we've seen that the GC can pause all threads for very long periods (in some cases with HBase we saw 4-minute pauses for GC). HBase was the first to see this; we worked with Solr early on to help them understand this issue as well. The problem can be alleviated somewhat by using the CMS/incremental GC options in the JVM, however it cannot be eliminated entirely (in some cases the GC will still drop back to parallel). You need to consider the impact of GC on your domain and how best to handle it. See this JIRA for details on our discussion with Solr, you might gain some good insight: http://bit.ly/d7OSQ1 https://issues.apache.org/jira/browse/SOLR-1277
>>
>> I did notice that ZooClient is not handling disconnection and expiration of the session in the process method. At the very least you need to handle the expiration. You may need to do something for disconnection too, but this depends on whether you have active or passive actors (masters). Here's a good link on session lifecycle: http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
>>
>> You might also want to set up a wiki page similar to these at some point; it would help us with future discussion and feedback, and provide insight for devs/users:
>> http://wiki.apache.org/hadoop/ZooKeeper/HBaseAndZooKeeper
>> http://wiki.apache.org/solr/ZooKeeperIntegration
>>
>> Regards and good luck,
>> Patrick

___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
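Patrick's advice on disconnection versus expiration can be illustrated with a small state sketch. It is deliberately self-contained and does not use the ZooKeeper client: in the real client these notifications arrive through Watcher.process(WatchedEvent) as KeeperState values (SyncConnected, Disconnected, Expired), and the SessionEvent enum and SessionTracker class below are hypothetical stand-ins for them. The point it tries to show is the asymmetry: after a disconnect the client library reconnects within the same session, so the application only has to pause, whereas after an expiration the session and its ephemeral nodes are gone and a new session must be opened.

```java
// Hypothetical stand-in for handling ZooKeeper session events; not ZooKeeper API.
final class SessionTracker {

    /** Stand-ins for KeeperState.SyncConnected, Disconnected and Expired. */
    enum SessionEvent { CONNECTED, DISCONNECTED, EXPIRED }

    private boolean usable = false;   // may we currently act on ZooKeeper state?
    private int sessionsOpened = 1;   // the initial session counts as the first

    /** Would be called from the watcher's process method in a real client. */
    void process(SessionEvent event) {
        switch (event) {
            case CONNECTED:
                usable = true;        // (re)connected within the current session
                break;
            case DISCONNECTED:
                usable = false;       // pause; the client library retries by itself
                break;
            case EXPIRED:
                usable = false;       // session is gone, ephemeral nodes are deleted
                reopenSession();      // a real client would create a new handle here
                break;
        }
    }

    private void reopenSession() {
        sessionsOpened++;             // placeholder for creating a new client and redoing setup
    }

    boolean isUsable() { return usable; }

    int sessionsOpened() { return sessionsOpened; }
}
```

Whether a master should step down on a mere disconnect, or only on expiration, is the active-versus-passive question Patrick raises; this sketch takes the conservative route and pauses on both.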
Re: [Neo] meta meta classes
Your point about the cardinality restriction is a correct observation. In fact it would be better to create an is-subdivision-of PropertyType on sub-division and give that a range of country with a cardinality of 1. Then for each subclass of sub-division a restriction should be set, naming the country class this specific sub-division class applies to. Still, it requires each country to be defined as both a class and an instance.

Date: Thu, 8 Apr 2010 19:15:30 +0200
From: matt...@neotechnology.com
To: user@lists.neo4j.org
Subject: Re: [Neo] meta meta classes

> So, you describe the model (with country, sub-division and has-sub-division), which is OK! Then you not only want to add data which would conform to it, you also describe the highest level of that data in the meta model itself, with cardinality for how many sub-divisions each such level must contain.
>
> My first thought here is: why put actual data into the meta model (the design isn't intended for that)? The second is: why (since you put actual data into the meta model) would you stop there? Why only say that France has 10 subdivisions? You don't say exactly which subdivisions, or how many/which subdivisions each subdivision has, and so on. Which benefits would modeling only the highest level of data get you? And if you could describe the entire data in the meta model, you end up with a meta model which describes the entire data and then, in addition, the data which is exactly the same as the meta model... what benefits does that give you?
>
> I'm just confused about the fact that you want to have the meta model (classes, properties, restrictions) and (some of) the actual data modeled in the meta model, whereas the meta model was intended to model the meta model (the UML diagram, so to speak) and not the actual data that has to conform to it. Help me understand which benefits you're after by modeling the top level of your actual data into the meta model itself.
>
> Best,
> Mattias
>
> 2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com
>
>> The examples of the tag library and of countries/sub-divisions are not necessarily similar. The first shows the need to model the properties of a class. The second shows the need for singleton classes, which is a different concept and something that cannot be done out of the box, but I will show a solution that requires minimal modifications to the current software.
>>
>> Suppose we want to model countries and their sub-divisions. We do the following:
>>
>> create class country
>> create class sub-division
>> create property has-sub-division
>> make has-sub-division a property of country
>>
>> Now we would like to populate the database. Since all countries have different types of sub-divisions, we need to create those:
>>
>> create class french_region
>> create class canadian_province
>> create class canadian_territory
>> create class US_state
>> etc.
>>
>> Populate the various classes with instances:
>>
>> create node for Alsace and make it an instance of french_region
>> create node for Aquitaine and make it an instance of french_region
>> create node for Alberta and make it an instance of canadian_province
>> create node for Nunavut and make it an instance of canadian_territory
>> create node for British Columbia and make it an instance of canadian_province
>> create node for Alabama and make it an instance of US_state
>> create node for Alaska and make it an instance of US_state
>> etc.
>>
>> We'd like to state that each country has its own restriction on the type of subdivision. To do that we need to create classes for each country:
>>
>> create class France
>> create class Canada
>> create class United_States_of_America
>> etc.
>>
>> make class France a subclass of country
>> make class Canada a subclass of country
>> make class United_States_of_America a subclass of country
>> etc.
>>
>> create restriction for France on has-sub-division with range french_region and cardinality = 26
>> create restriction for Canada on has-sub-division with range canadian_province and cardinality = 10
>> create restriction for Canada on has-sub-division with range canadian_territory and cardinality = 3
>> create restriction for United_States_of_America on has-sub-division with range US_state and cardinality = 50
>> etc.
>>
>> Now we have classes for each country, but no instances. Unfortunately we cannot say that a class is an instance of itself: that would require a relationship where the end node equals the start node, which Neo4J doesn't allow. So we have to create separate instances for each country:
>>
>> create node for France
>> create node for Canada
>> create node for United_States_of_America
>> etc.
>>
>> make node France an instance of class France
>> make node Canada an instance of class Canada
>> make node United_States_of_America an instance of class United_States_of_America
>>
>> And finally link the subdivisions to their countries:
>>
>> France has-sub-division Alsace France
Re: [Neo] meta meta classes
2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com

> Your point about the cardinality restriction is a correct observation. In fact it would be better to create a is-subdivision-of PropertyType on sub-division and give that a range country with a cardinality of 1. Then for each subclass of sub-division a restriction should be set, naming the country class this specific sub-division class applies to. Still, it requires each country to be defined as both a class and an instance.

Why just countries as classes? Why not each subdivision as a class as well? Why countries as classes at all?

> Date: Thu, 8 Apr 2010 19:15:30 +0200
> From: matt...@neotechnology.com
> To: user@lists.neo4j.org
> Subject: Re: [Neo] meta meta classes
>
>> So, you describe the model (with country, sub-division and has-sub-division), which is OK! Then you not only want to add data which would conform to it, you also describe the highest level of that data in the meta model itself, with cardinality for how many sub-divisions each such level must contain.
>>
>> My first thought here is: why put actual data into the meta model (the design isn't intended for that)? The second is: why (since you put actual data into the meta model) would you stop there? Why only say that France has 10 subdivisions? You don't say exactly which subdivisions, or how many/which subdivisions each subdivision has, and so on. Which benefits would modeling only the highest level of data get you? And if you could describe the entire data in the meta model, you end up with a meta model which describes the entire data and then, in addition, the data which is exactly the same as the meta model... what benefits does that give you?
>>
>> I'm just confused about the fact that you want to have the meta model (classes, properties, restrictions) and (some of) the actual data modeled in the meta model, whereas the meta model was intended to model the meta model (the UML diagram, so to speak) and not the actual data that has to conform to it.
Re: [Neo] meta meta classes
Each country needs to be modeled as a class, because I want to set the restriction that French regions (which can have different properties from Canadian provinces) can only have a relationship with the country France, and Canadian provinces can only have a relationship with the country Canada. The domain and the range of a PropertyType are classes, not instances. If countries were simply instances of the country class, it would be possible to say that an instance of a Canadian province is a subdivision of France.

I'd like to be able to iterate over the subdivisions of France and have it guaranteed that each instance has the property region code, a property unknown to Canadian provinces. Without a restriction stating that a specific sub-division belongs to a specific country, any sub-division can be related to any country, so a user may erroneously say that Alberta is a French region. Not only is this factually incorrect, it is structurally incorrect too: Alberta, being a Canadian province, doesn't have the region code property, which I want French regions to have.

Date: Thu, 8 Apr 2010 20:06:32 +0200
From: matt...@neotechnology.com
To: user@lists.neo4j.org
Subject: Re: [Neo] meta meta classes

> 2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com
>
>> Your point about the cardinality restriction is a correct observation. In fact it would be better to create a is-subdivision-of PropertyType on sub-division and give that a range country with a cardinality of 1. Then for each subclass of sub-division a restriction should be set, naming the country class this specific sub-division class applies to. Still, it requires each country to be defined as both a class and an instance.
>
> Why just countries as classes? why not each subdivision as classes as well? Why countries as classes at all?
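The restriction argued for in this message, where each sub-division class names the one country class it may point to, can be sketched as a small check. This is a plain-Java illustration only and does not use the Neo4j meta model API; the SubdivisionModel class and its method names are hypothetical. Because the range of is-subdivision-of is a class, each country appears twice: once as a class and once as the single instance of that class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the Neo4j meta model API.
final class SubdivisionModel {
    // sub-division class -> country class allowed as range of is-subdivision-of
    private final Map<String, String> rangeRestriction = new HashMap<>();
    // instance name -> name of the class it is an instance of
    private final Map<String, String> instanceOf = new HashMap<>();

    /** Restricts is-subdivision-of on the given sub-division class to one country class. */
    void restrict(String subdivisionClass, String countryClass) {
        rangeRestriction.put(subdivisionClass, countryClass);
    }

    /** Declares an instance of a class (a sub-division or a country). */
    void addInstance(String name, String className) {
        instanceOf.put(name, className);
    }

    /** True only if the country's class is the range allowed for the sub-division's class. */
    boolean mayLink(String subdivision, String country) {
        String allowed = rangeRestriction.get(instanceOf.get(subdivision));
        return allowed != null && allowed.equals(instanceOf.get(country));
    }
}
```

With canadian_province restricted to the class Canada, and the nodes Canada and France declared as instances of the classes Canada and France respectively, mayLink("Alberta", "Canada") holds while mayLink("Alberta", "France") does not, which is the guarantee asked for here.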
Re: [Neo] Traversers in the REST API
On 8 April 2010 17:23, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote:

> What I want to avoid is keeping state on the server while waiting for the client to request the next page.

You are quite right. However, I think for many use cases (e.g. generating a paginated list of results on a webpage) it would not be necessary to store state on the server. That would be more similar to a SQL cursor; what I am talking about is simply SQL LIMIT, OFFSET and ORDER BY.

Cheers
Al

-- Dr Alastair James
CTO James Publishing Ltd.
http://www.linkedin.com/pub/3/914/163
www.worldreviewer.com
WINNER Travolution Awards Best Travel Information Website 2009
WINNER IRHAS Awards, Los Angeles, Best Travel Website 2008
WINNER Travolution Awards Best New Online Travel Company 2008
WINNER Travel Weekly Magellan Award 2008
WINNER Yahoo! Finds of the Year 2007
Noli nothis permittere te terere!
Re: [Neo] Traversers in the REST API
Tobias Ivarsson wrote on 08.04.2010 at 18:23:27 (+0200) [Re: [Neo] Traversers in the REST API]:

> On Wed, Apr 7, 2010 at 3:05 PM, Alastair James al.ja...@gmail.com wrote:
>
>> when we start talking about returning 1000s of nodes in JSON over HTTP just to get the first 10 this is clearly sub-optimal (as I build websites this is a very common use case). So, as you say, sorting and limiting can wait, but I suspect the HTTP API would benefit from offering it. Limiting need not require changes to the core API, it could be implemented as a second stage in the HTTP API code prior to output encoding.
>
> For paging / limiting: yes, you are absolutely right, this would not affect the core API at all, only the REST API. Limiting/paging is something we would probably add to the REST API before sorting.

Limiting and paging usually go hand in hand with sorting, in my experience. Why would anyone want to page through an unsorted collection?

> Sorting might be a similar case, but I still think the client would be better fitted to do sorting well.

The server has indexes to support the sorting. (If it doesn't, it has a problem anyway.) What does the client have to support sorting? So how would it be better fitted to do sorting well?

> But once paging / limiting is added it would be quite natural / useful to add sorting as well. What I want to avoid is keeping state on the server while waiting for the client to request the next page.

If you ensure a binary tree index is used to do the sorting, you should be fine.

-- Michael Ludwig
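The stateless paging discussed in this thread (in the spirit of SQL ORDER BY, OFFSET and LIMIT) can be sketched as a pure function that is re-evaluated for every request, so nothing has to be remembered on the server between pages. The Paging class and its page method are hypothetical names for illustration, not part of the Neo4j REST API, and the sketch sorts an in-memory list where a real server would preferably read from a sorted index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; not part of the Neo4j REST API.
final class Paging {
    private Paging() {}

    /**
     * Sorts a copy of the results, skips `offset` items and returns at most `limit` items.
     * Every call is independent, so no per-client cursor is kept between requests.
     */
    static <T> List<T> page(List<T> results, Comparator<? super T> order, int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be non-negative");
        }
        List<T> sorted = new ArrayList<>(results);
        sorted.sort(order);
        int from = Math.min(offset, sorted.size());
        int to = (int) Math.min((long) from + limit, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

The trade-off is the one raised in the thread: the full result is recomputed and sorted for each page, which is acceptable when an index keeps that cheap and avoids holding state while waiting for the client to ask for the next page.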
Re: [Neo] meta meta classes
Ok now I get your point! Thank you for clarifying. Your singleton proposal could be a good idea then. Could it potentially be a hindrance in some scenario? I mean should we have a MetaModelClass#setSingleton(boolean) or something so that this behaviour can be controlled? 2010/4/8, Niels Hoogeveen pd_aficion...@hotmail.com: Each country needs to be modeled as classes, because I want to set the restriction that French regions (which can have different properties from Canadian provinces) can only have a relationship with the country France, and Canadian provinces can only have a relationship with the country Canada. The domain and the range of a PropertyType are classes not instances. If countries were simply instances of the country class, it would be possible to say that an instance of a Canadian province is a subdivision of France. I'd like to be able to iterate over the subdivision of France and have guaranteed that each instance has the property region code, a property unknown to Canadian provinces. Without having a restriction stating that a specific sub-division belongs to a specific country, any sub-division can be related to any country, so a user may erroneously say that Alberta is a French region. Not only is this factually incorrect, but structurally too. Alberta, being a Canadian province, doesn't have the region code property, which I want French regions to have. Date: Thu, 8 Apr 2010 20:06:32 +0200 From: matt...@neotechnology.com To: user@lists.neo4j.org Subject: Re: [Neo] meta meta classes 2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com Your point about the cardinality restriction is a correct observation. In fact it would be better to create a is-subdivision-of PropertyType on sub-division and give that a range country with a cardinality of 1. Then for each subclass of sub-division a restriction should be set, naming the country class this specific sub-division class applies to. 
Re: [Neo] Date effectiveness (Time Variance) implementation in Neo4J
suryadev vasudev wrote on 06.04.2010 at 23:26:35 (-0700) [[Neo] Date effectiveness (Time Variance) implementation in Neo4J]:

> We are exploring Neo4J for a resource management application.

[ straightforward requirements list without any discernible graph specifics snipped ]

> In Neo4J, we created Library, Book-Club, Publisher, Student and Books. We are finding it difficult to implement the time variance.

Oh, that ...

> The business requirements are:
>
> 1. The book publisher can lease books till his end registering date
> 2. Publisher can specify lease start date and end date for each book
> 3. Do not lend beyond end leasing date
> 4. Do not lend beyond end membership date
> 5. Query Student-book relationships (What books were borrowed/reserved, who was the publisher, what was the book club) for a given date range
>
> How do we model the date in Neo4J?

Heretical counter-question: Why model the date in Neo4J if any SQL database provides full-spectrum date-time functionality?

-- Michael Ludwig
Re: [Neo] How to efficiently query in Neo4J?
Alastair James wrote on 07.04.2010 at 15:53:50 (+0100) [[Neo] How to efficiently query in Neo4J?]:

> Briefly, the site consists of posts, each tagged with various attributes, e.g. (it's a travel site) location, theme, cost etc... Also the tags are hierarchical. So, for location we have (say) 'tuscany' inside 'italy' inside 'europe'. For theme we have (say) 'cycling' inside 'activity'.

After giving this some thought, it looks to me as if there is nothing particularly graphy in your example. I know, most everything is a graph, but here the data is more regular: your hierarchical catalog of tags immediately made me think of Joe Celko's nested sets, which is a very efficient way to represent trees in terms of sets, as found in SQL databases. (Heresy again, I know, but well.) And the relationship of posts to tags is simply N-M, and that's it.

There aren't any real links (edges) between posts, which arguably would make your data model more graphy. In your model, related posts are related by virtue of their attributes (they share some tags, or are posted by the same user), and not eis ipsis. So I'd say there is not much in the way of graphiness.

-- Michael Ludwig
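For readers unfamiliar with the nested sets idea mentioned in this message, here is a minimal in-memory sketch. In the usual SQL formulation each tag row carries a left and a right number assigned by a depth-first walk, and a tag lies inside another tag exactly when its interval is contained in the other's. The NestedSets class, its method names and the artificial root tag "all" are assumptions made for this illustration; only the tag names come from the thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of the nested set model.
final class NestedSets {
    private final Map<String, List<String>> children = new LinkedHashMap<>();
    // tag -> {left, right}, stored in depth-first (pre-order) sequence
    private final Map<String, int[]> bounds = new LinkedHashMap<>();

    void addChild(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    /** Assigns left/right numbers to every tag reachable from the root. */
    void number(String root) {
        bounds.clear();
        visit(root, 1);
    }

    /** Numbers one subtree and returns the next unused number. */
    private int visit(String tag, int next) {
        int[] interval = {next, 0};
        bounds.put(tag, interval);
        next++;
        for (String child : children.getOrDefault(tag, Collections.<String>emptyList())) {
            next = visit(child, next);
        }
        interval[1] = next;
        return next + 1;
    }

    /** True if tag equals ancestor or lies anywhere below it. */
    boolean isWithin(String tag, String ancestor) {
        int[] t = bounds.get(tag);
        int[] a = bounds.get(ancestor);
        if (t == null || a == null) {
            throw new IllegalArgumentException("unknown tag");
        }
        return a[0] <= t[0] && t[1] <= a[1];
    }

    /** The ancestor and all tags below it; one interval comparison per tag. */
    List<String> subtree(String ancestor) {
        List<String> result = new ArrayList<>();
        for (String tag : bounds.keySet()) {
            if (isWithin(tag, ancestor)) {
                result.add(tag);
            }
        }
        return result;
    }
}
```

Finding every post tagged anywhere under 'europe' then becomes a range condition on the left and right numbers joined against the N-M post/tag table, with no recursive traversal; the cost is that inserting a tag renumbers part of the tree.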
Re: [Neo] meta meta classes
I think the best solution here is to have an instance enumeration on MetaModelClass. Singletons are a special case of an enumeration. See:
http://www.w3.org/TR/2002/WD-owl-ref-20021112/#Enumerated
http://owl.cs.manchester.ac.uk/2007/05/api/javadoc/org/semanticweb/owl/model/OWLObjectOneOf.html

Date: Thu, 8 Apr 2010 22:33:11 +0200
From: matt...@neotechnology.com
To: user@lists.neo4j.org
Subject: Re: [Neo] meta meta classes

> Ok now I get your point! Thank you for clarifying. Your singleton proposal could be a good idea then. Could it potentially be a hindrance in some scenario? I mean should we have a MetaModelClass#setSingleton(boolean) or something so that this behaviour can be controlled?
>
> 2010/4/8, Niels Hoogeveen pd_aficion...@hotmail.com:
>
>> Each country needs to be modeled as classes, because I want to set the restriction that French regions (which can have different properties from Canadian provinces) can only have a relationship with the country France, and Canadian provinces can only have a relationship with the country Canada. The domain and the range of a PropertyType are classes not instances. If countries were simply instances of the country class, it would be possible to say that an instance of a Canadian province is a subdivision of France.
>>
>> I'd like to be able to iterate over the subdivision of France and have guaranteed that each instance has the property region code, a property unknown to Canadian provinces. Without having a restriction stating that a specific sub-division belongs to a specific country, any sub-division can be related to any country, so a user may erroneously say that Alberta is a French region. Not only is this factually incorrect, but structurally too. Alberta, being a Canadian province, doesn't have the region code property, which I want French regions to have.
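The proposal at the top of this message, an instance enumeration with singletons as the one-element case, follows the idea behind owl:oneOf. A rough sketch of what such a class could look like is given below. It is not the MetaModelClass API: EnumeratedClass and its methods are hypothetical names, and instances are represented by plain strings for brevity.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration of an enumerated (oneOf-style) class; not the Neo4j meta model API.
final class EnumeratedClass {
    private final String name;
    private final Set<String> allowed; // null means an ordinary, open class

    private EnumeratedClass(String name, Set<String> allowed) {
        this.name = name;
        this.allowed = allowed;
    }

    /** An ordinary class that accepts any instance. */
    static EnumeratedClass open(String name) {
        return new EnumeratedClass(name, null);
    }

    /** A class whose instances are exactly the listed ones. */
    static EnumeratedClass oneOf(String name, String... instances) {
        Set<String> set = new LinkedHashSet<>(Arrays.asList(instances));
        return new EnumeratedClass(name, Collections.unmodifiableSet(set));
    }

    boolean accepts(String instance) {
        return allowed == null || allowed.contains(instance);
    }

    /** A singleton is simply an enumeration with exactly one member. */
    boolean isSingleton() {
        return allowed != null && allowed.size() == 1;
    }

    String name() {
        return name;
    }
}
```

Under this reading the class France would be declared as oneOf("France", "France"), which makes a separate setSingleton(boolean) switch unnecessary: whether a class is a singleton follows from the size of its enumeration.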
Date: Thu, 8 Apr 2010 20:06:32 +0200 From: matt...@neotechnology.com To: user@lists.neo4j.org Subject: Re: [Neo] meta meta classes 2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com Your point about the cardinality restriction is a correct observation. In fact it would be better to create a is-subdivision-of PropertyType on sub-division and give that a range country with a cardinality of 1. Then for each subclass of sub-division a restriction should be set, naming the country class this specific sub-division class applies to. Still, it requires each country to be defined as both a class and an instance. Why just countries as classes? why not each subdivision as classes as well? Why countries as classes at all? Date: Thu, 8 Apr 2010 19:15:30 +0200 From: matt...@neotechnology.com To: user@lists.neo4j.org Subject: Re: [Neo] meta meta classes So, you describe the model (with country, sub-division and has-sub-division) which is OK! Then you not just want to add data which would conform to it, you also describe the highest level of that data in the meta model itself with cardinality for how many sub-divisions each such level must contain. My first though here is: why put actual data into the meta model (the design isn't intended for that)? The second is: why (since you put actual data into the meta model) would you stop there? why only say that France have 10 subdivisions? You don't say exactly which subdivisions or how many/which subdivisions each subdivision has, a.s.o. Which benefits would modeling only the highest level of data get you? And if you could describe the entire data in the meta model, you end up with a meta model which describes the entire data and then, in addition, the data which is exactly the same as the meta model... what benefits does that give you? 
I'm just confused about the fact that you want to have the meta model (classes, properties, restrictions) and (some of) the actual data modeled in the meta model, whereas the meta model was intended to model the meta model (the UML diagram, so to speak) and not the actual data that would conform to it. Help me understand which benefits you're after by modeling the top level of your actual data into the meta model itself.

Best, Mattias

2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com

The examples of the tag library and countries/sub-divisions are not necessarily similar. The first shows the need to model the properties of a class. The second shows the need to have singleton classes, which is a different concept, and something that cannot be done out of the box, but I will show a solution that requires minimal modifications to the current software. Suppose we want to model countries and their sub-divisions. We do the
Re: [Neo] How to efficiently query in Neo4J?
Max De Marzi Jr. wrote on 08.04.2010 at 16:48:18 (-0500) [Re: [Neo] How to efficiently query in Neo4J?]:

> You know this is something that I think needs to be made clear... using just the graph is not the right way to go unless you have a very special application. Some things are better not done in the graph. So I decided to keep that in tables, and just move the person relationships to the graph (works with, manages, knows, friends, etc). I treat the graph like a specialized index. Makes a lot more sense now, and I get the best of both worlds.

Exactly what I think. An iterable index, and a great one for the kind of graphy queries that cannot be done efficiently using sets and joins. Any thoughts on what constitutes *graphiness*, if I may venture this term?

-- Michael Ludwig
___ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
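A minimal sketch of the "graph as a specialized index" idea, assuming nothing about Neo4j itself: the `people` dictionary stands in for a relational table, the `edges` mapping stands in for the graph, and `relate`, `reachable` and `names_within` are hypothetical helpers. Only primary keys live in the graph; the attributes stay in the table and are looked up after the traversal.

```python
from collections import defaultdict

# Stand-in for a relational table: person rows keyed by primary key.
people = {
    1: {"name": "Ann"},
    2: {"name": "Bob"},
    3: {"name": "Cid"},
    4: {"name": "Dee"},
}

# Stand-in for the graph: (source id, relationship type) -> set of target ids.
edges = defaultdict(set)


def relate(source, rel_type, target):
    """Record a directed, typed relationship between two primary keys."""
    edges[(source, rel_type)].add(target)


relate(1, "knows", 2)
relate(2, "knows", 3)
relate(3, "knows", 4)


def reachable(start, rel_type, max_depth):
    """Ids reachable from start within max_depth hops, nearest first."""
    seen = {start}
    frontier = [start]
    found = []
    for _ in range(max_depth):
        next_frontier = []
        for node in frontier:
            for other in sorted(edges.get((node, rel_type), ())):
                if other not in seen:
                    seen.add(other)
                    found.append(other)
                    next_frontier.append(other)
        frontier = next_frontier
    return found


def names_within(start, rel_type, max_depth):
    """Traverse the graph for ids, then fetch the rows from the table."""
    return [people[i]["name"] for i in reachable(start, rel_type, max_depth)]
```

Here `names_within(1, "knows", 2)` walks two hops in the graph and only then touches the table, which is the division of labour described above: the graph answers "who is connected to whom", the tables answer "what do we know about them".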
Re: [Neo] How to efficiently query in Neo4J?
As always, it really isn't that simple. Comparing cold queries is probably not a good indicator of steady-state performance, since RDBMSs and graph DBs have different models for file system access and caching. Even different RDBMSs have dramatically different behaviors in common queries (ever try to use MySQL for set operations - yuck.). Factor in a wide range of SLAs needed for performance vs availability vs affordability vs scalability vs administration costs, and the equation gets a whole lot more complicated.

I'm sure there's a graphy model for the tag/post example that could be made smoking fast with Neo also. Throw columnar storage, key-value, and document DBs into the mix, and the good news is that we have a lot of weapons in our arsenal now to tackle very demanding and diverse application challenges!
Re: [Neo] How to efficiently query in Neo4J?
Hi...

On 8 April 2010 22:35, Michael Ludwig mil...@gmx.de wrote:

> After giving this some thought, it looks to me as if there is nothing particularly graphy in your example. I know, most everything is a graph, but here the data is more regular: Your hierarchical catalog of tags immediately made me think of Joe Celko's nested sets, which is a very efficient way to represent trees in terms of sets, as found in SQL databases. (Heresy again, I know, but well.) And the relationship of posts to tags is simply N-M, and that's it.

We are currently using something similar to model this in SQL. However, maintaining the nested set model is quite complex and something I really want to avoid in user code.

> There aren't any real links (edges) between posts, which arguably would make your data model more graphy. In your model, related posts are related by virtue of their attributes (they share some tags, or are posted by the same user), and not eis ipsis. So I'd say there is not much in the way of graphiness.

It was a simplified example. In reality there are relations between posts, between posts and authors, between tags and tags, etc. It is exactly because we want 'anything to be relatable to anything' that the graph database model works so well.

Al
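For readers unfamiliar with the nested set model mentioned above, here is a small in-memory sketch. The tag names and the helpers `descendants` and `insert_leaf` are made up for illustration; in SQL the same bounds would be two integer columns. Each tag stores a left and a right bound, a descendant is any tag whose bounds fall strictly inside its ancestor's, and inserting a tag means renumbering every bound at or beyond the insertion point, which is the maintenance cost being discussed.

```python
# tag -> (left, right) bounds; "travel" is the root of the hierarchy.
tags = {
    "travel": (1, 8),
    "europe": (2, 5),
    "france": (3, 4),
    "asia": (6, 7),
}


def descendants(tag):
    """All tags nested inside `tag`: a pure range comparison, no recursion."""
    lft, rgt = tags[tag]
    return sorted(name for name, (l, r) in tags.items() if lft < l and r < rgt)


def insert_leaf(parent, name):
    """Add `name` as the last child of `parent`; return how many existing tags were renumbered."""
    _, parent_rgt = tags[parent]
    touched = 0
    for other, (l, r) in list(tags.items()):
        # Every bound at or after the parent's right bound shifts by two.
        new_l = l + 2 if l >= parent_rgt else l
        new_r = r + 2 if r >= parent_rgt else r
        if (new_l, new_r) != (l, r):
            tags[other] = (new_l, new_r)
            touched += 1
    tags[name] = (parent_rgt, parent_rgt + 1)
    return touched
```

Reading the tree is cheap, but `insert_leaf("europe", "spain")` has to rewrite the bounds of three of the four existing tags, including ones that are unrelated to the new tag apart from sitting to its right.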
Re: [Neo] How to efficiently query in Neo4J?
> As always, it really isn't that simple. Comparing cold queries is probably not a good indicator of steady state performance, since RDBMS's and Graph DB's have different models for file system access and caching. Even different RDBMS's have dramatically different behaviors in common queries (ever try to use MySQL for set operations - yuck.). Factor in a wide range of SLAs needed for performance vs availability vs affordability vs scalability vs administration costs, and the equation gets a whole lot more complicated.

Exactly. From experience it's possible to build a post/tag system in SQL that performs very well. However, the SQL model is inherently less flexible than the graph database model (what if I want to introduce a new relationship type? In a traditional SQL schema that would require new join tables etc.).

> I'm sure there's a graphy-model for the tag/post example that could be made smoking fast with Neo also.

Hopefully! I suppose my question is this: I would like to harness a graph database to give flexibility and eloquence to our data model, but can I query it efficiently without domain-specific hacks and extra layers of code?

Al

-- Dr Alastair James, CTO, James Publishing Ltd. http://www.linkedin.com/pub/3/914/163 www.worldreviewer.com
Re: [Neo] Traversers in the REST API
On 8 April 2010 21:17, Michael Ludwig mil...@gmx.de wrote:

> Limiting and paging usually go hand in hand with sorting, in my experience. Why would anyone want to page through an unsorted collection?

It's quite possible that you might want the nodes in the order they were found (e.g. the closest matching nodes first). However, I agree, sorting by an arbitrary property is very useful!

Al
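The difference between the two cases can be sketched as follows. This is a toy breadth-first traversal over a hand-written adjacency map, not the Neo4j traverser or the REST API; `graph`, `breadth_first` and `page` are hypothetical names. Paging in the order nodes are found can stay lazy, because the traversal stops as soon as the page is full, whereas sorting by an arbitrary property has to materialize the whole result before the first page can be cut.

```python
from collections import deque
from itertools import islice

# Toy directed graph as an adjacency map.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": [],
    "e": [],
}


def breadth_first(start):
    """Lazily yield nodes reachable from start, nearest first, excluding start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
                yield neighbour


def page(results, offset, limit):
    """Take one page from any iterable without consuming more than needed."""
    return list(islice(results, offset, offset + limit))


# Traversal order: the generator is only advanced as far as the page requires.
first_page = page(breadth_first("a"), 0, 2)
second_page = page(breadth_first("a"), 2, 2)

# Arbitrary order: sorted() must exhaust the traversal before paging.
first_page_sorted = page(sorted(breadth_first("a"), reverse=True), 0, 2)
```

Note that each page above restarts the traversal, so deep offsets cost as much as walking to that depth; a server keeping the generator alive between requests would avoid that, at the price of holding state.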
Re: [Neo] How to efficiently query in Neo4J?
rick.bullotta wrote on 08.04.2010 at 15:16:11 (-0700) [Re: [Neo] How to efficiently query in Neo4J?]:

> Factor in a wide range of SLAs needed for performance vs availability vs affordability vs scalability vs administration costs, and the equation gets a whole lot more complicated.

Granted.

> I'm sure there's a graphy-model for the tag/post example that could be made smoking fast with Neo also.

Sure, but there's also a way of looking at screws that might suggest you should use a hammer ;-) and it would be wrong. Which doesn't mean it couldn't be modeled for the tag/post example - just a general caveat to think about both tools and problems when trying to find a good solution.

> Throw columnar storage, key-value, and document DB's into the mix, and the good news is that we have a lot of weapons in our arsenal now to tackle very demanding and diverse application challenges!

Yes, it's becoming very interesting. Lots of new high-level tools for specialized or relaxed requirements. SQL won't be dethroned, though.

-- Michael Ludwig