Re: [Neo] How to efficiently query in Neo4J?
Alastair James wrote on 09.04.2010 at 14:04:37 (+0100) [Re: [Neo] How to efficiently query in Neo4J?]:

> So, I suppose this question boils down to: is there an efficient way to calculate the union of two traversals without retrieving all result sets and performing the union in user code?

No need for two traversals if you annotate your category tree in Neo4j the same way Celko has popularized for SQL, i.e. marking each category with *left* and *right*. It's really not a question of graphs or sets, as in both cases what you deal with is a tree.

http://intelligent-enterprise.informationweek.com/001020/celko.jhtml

Note that this needs some custom logic for category tree updates. But it's not difficult in SQL, and I think it's not much more difficult in Neo4j either.

-- 
Michael Ludwig
___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
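To illustrate the *left*/*right* annotation mentioned above, here is a minimal Python sketch. It is not Neo4j code: the tree is a plain dictionary, the category names are borrowed from the thread's examples, and `number_tree` and `in_subtree` are hypothetical helper names. It only shows how one depth-first walk assigns the numbers and how a subtree test then becomes a range comparison.

```python
from typing import Dict, List, Tuple

# Hypothetical category tree: parent -> children.
TREE: Dict[str, List[str]] = {
    "root": ["europe", "activity"],
    "europe": ["italy"],
    "italy": ["tuscany"],
    "activity": ["cycling"],
}


def number_tree(tree: Dict[str, List[str]], root: str) -> Dict[str, Tuple[int, int]]:
    """Assign (left, right) to every node with a single depth-first walk."""
    bounds: Dict[str, Tuple[int, int]] = {}
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        counter += 1
        left = counter              # number handed out on the way down
        for child in tree.get(node, []):
            visit(child)
        counter += 1                # number handed out on the way back up
        bounds[node] = (left, counter)

    visit(root)
    return bounds


def in_subtree(bounds: Dict[str, Tuple[int, int]], node: str, ancestor: str) -> bool:
    """True if node is the ancestor itself or lies anywhere below it."""
    left, right = bounds[node]
    a_left, a_right = bounds[ancestor]
    return a_left <= left and right <= a_right


bounds = number_tree(TREE, "root")
print(bounds["europe"])                          # (2, 7)
print(in_subtree(bounds, "tuscany", "europe"))   # True
print(in_subtree(bounds, "cycling", "europe"))   # False
```

With the numbers stored as properties on the category nodes, "everything under europe or under activity" is two range checks rather than two traversals. Inserting or moving a category means renumbering part of the tree, which is the custom update logic referred to above.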
Re: [Neo] Traversers in the REST API
Tobias Ivarsson wrote on 08.04.2010 at 18:23:27 (+0200) [Re: [Neo] Traversers in the REST API]:

> On Wed, Apr 7, 2010 at 3:05 PM, Alastair James al.ja...@gmail.com wrote:
>
>> when we start talking about returning 1000s of nodes in JSON over HTTP just to get the first 10 this is clearly sub-optimal (as I build websites this is a very common use case). So, as you say, sorting and limiting can wait, but I suspect the HTTP API would benefit from offering it. Limiting need not require changes to the core API, it could be implemented as a second stage in the HTTP API code prior to output encoding.
>
> For paging / limiting: yes, you are absolutely right, this would not affect the core API at all, only the REST API. Limiting/paging is something we would probably add to the REST API before sorting.

Limiting and paging usually go hand in hand with sorting, in my experience. Why would anyone want to page through an unsorted collection?

> Sorting might be a similar case, but I still think the client would be better fitted to do sorting well.

The server has indexes to support the sorting. (If it doesn't, it has a problem anyway.) What does the client have to support sorting? So how would it be better fitted to do sorting well?

> But once paging / limiting is added it would be quite natural / useful to add sorting as well. What I want to avoid is keeping state on the server while waiting for the client to request the next page.

If you ensure a binary tree index is used to do the sorting, you should be fine.

-- 
Michael Ludwig
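The point about a sorted index making paging stateless can be sketched as follows. This is an illustration in Python, not part of the Neo4j REST API: a sorted list searched with `bisect` stands in for the tree index, and `INDEX`, `page` and the cursor format are assumptions made up for the example. The client sends back the last entry it saw, and the server seeks straight to that position instead of remembering anything between requests.

```python
import bisect
from typing import List, Optional, Tuple

Entry = Tuple[str, int]  # (sort key, node id)

# Hypothetical sorted index standing in for a server-side tree index.
INDEX: List[Entry] = sorted(
    [("alice", 3), ("bob", 1), ("carol", 7), ("dave", 2), ("erin", 5)]
)


def page(index: List[Entry], after: Optional[Entry], limit: int):
    """Return up to `limit` entries strictly after the `after` cursor.

    The cursor is simply the last entry of the previous page, so the
    server keeps no per-client state between requests.
    """
    start = 0 if after is None else bisect.bisect_right(index, after)
    items = index[start:start + limit]
    has_more = start + limit < len(index)
    next_cursor = items[-1] if items and has_more else None
    return items, next_cursor


first, cursor = page(INDEX, None, 2)
second, cursor2 = page(INDEX, cursor, 2)
third, cursor3 = page(INDEX, cursor2, 2)
print(first)    # [('alice', 3), ('bob', 1)]
print(second)   # [('carol', 7), ('dave', 2)]
print(third)    # [('erin', 5)]
```

Each request costs one logarithmic seek plus the page itself, which is why sorting on the server need not imply holding a result set open for the client.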
Re: [Neo] Date effectiveness (Time Variance) implementation in Neo4J
suryadev vasudev wrote on 06.04.2010 at 23:26:35 (-0700) [[Neo] Date effectiveness (Time Variance) implementation in Neo4J]:

> We are exploring Neo4J for a resource management application.

[ straightforward requirements list without any discernible graph specifics snipped ]

> In Neo4J, we created Library, Book-Club, Publisher, Student and Books. We are finding it difficult to implement the time variance.

Oh, that ...

> The business requirements are:
> 1. The book publisher can lease books till his end registering date
> 2. Publisher can specify lease start date and end date for each book
> 3. Do not lend beyond end leasing date
> 4. Do not lend beyond end membership date
> 5. Query Student-book relationships (What books were borrowed/reserved, who was the publisher, what was the book club) for a given date range
>
> How do we model the date in Neo4J?

Heretical counter-question: Why model the date in Neo4J if any SQL database provides full-spectrum date-time functionality?

-- 
Michael Ludwig
Re: [Neo] How to efficiently query in Neo4J?
Alastair James wrote on 07.04.2010 at 15:53:50 (+0100) [[Neo] How to efficiently query in Neo4J?]:

> Briefly, the site consists of posts, each tagged with various attributes, e.g. (it's a travel site) location, theme, cost etc... Also the tags are hierarchical. So, for location we have (say) 'tuscany' inside 'italy' inside 'europe'. For theme we have (say) 'cycling' inside 'activity'.

After giving this some thought, it looks to me as if there is nothing particularly graphy in your example. I know, most everything is a graph, but here the data is more regular: Your hierarchical catalog of tags immediately made me think of Joe Celko's nested sets, which is a very efficient way to represent trees in terms of sets, as found in SQL databases. (Heresy again, I know, but well.)

And the relationship of posts to tags is simply N-M, and that's it. There aren't any real links (edges) between posts, which arguably would make your data model more graphy. In your model, related posts are related by virtue of their attributes (they share some tags, or are posted by the same user), and not eis ipsis. So I'd say there is not much in the way of graphiness.

-- 
Michael Ludwig
Re: [Neo] How to efficiently query in Neo4J?
Max De Marzi Jr. wrote on 08.04.2010 at 16:48:18 (-0500) [Re: [Neo] How to efficiently query in Neo4J?]:

> You know this is something that I think needs to be made clear... using just the graph is not the right way to go unless you have a very special application. Some things are better not done in the graph. So I decided to keep that in tables, and just move the person relationships to the graph (works with, manages, knows, friends, etc). I treat the graph like a specialized index. Makes a lot more sense now, and I get the best of both worlds.

Exactly what I think. An iterable index, and a great one for the kind of graphy queries that cannot be done efficiently using sets and joins.

Any thoughts on what constitutes *graphiness*, if I may venture this term?

-- 
Michael Ludwig
Re: [Neo] How to efficiently query in Neo4J?
rick.bullotta wrote on 08.04.2010 at 15:16:11 (-0700) [Re: [Neo] How to efficiently query in Neo4J?]:

> Factor in a wide range of SLAs needed for performance vs availability vs affordability vs scalability vs administration costs, and the equation gets a whole lot more complicated.

Granted.

> I'm sure there's a graphy-model for the tag/post example that could be made smoking fast with Neo also.

Sure, but there's also a way of looking at screws that might suggest you should use a hammer ;-) and it would be wrong. Which doesn't mean it couldn't be modeled for the tag/post example - just a general caveat to think about both tools and problems when trying to find a good solution.

> Throw columnar storage, key-value, and document DBs into the mix, and the good news is that we have a lot of weapons in our arsenal now to tackle very demanding and diverse application challenges!

Yes, it's becoming very interesting. Lots of new high-level tools for specialized or relaxed requirements. SQL won't be dethroned, though.

-- 
Michael Ludwig
Re: [Neo] Requirements for an event framework for Neo4j
Laurent Laborde wrote on 31.03.2010 at 13:52:52 (+0200):

> I don't remember the exact English name but... are you, in fact, planning some kind of stored function (like PLSQL in postgresql)? (example of stored function: BEFORE INSERT ON something FOR EACH ROW EXECUTE someFunction() )

I think what you're referring to here is *triggers* (as common in SQL databases), which react to events, not dissimilar to what has been outlined by Tobias in the mail you're replying to.

-- 
Michael Ludwig
Re: [Neo] XPath in REST API
Mattias Persson wrote on 30.03.2010 at 15:02:19 (+0200):

> We're discussing how to expose traversers in the REST API. One of the ideas that was brought up (more emails with the rest of the ideas are coming) was to use xpath directly in the URIs.

I have some experience working with XSLT and XPath, and I'm probably missing the context here, so I'm wondering: You're probably just considering using some subset of XPath? Like self, child and attribute axes? Because even 1.0 gives you considerable power in navigating trees [1], not to mention 2.0, which has conditional branching, loops and lots more.

Also, XPath being for trees, do you constrain the graph to tree form?

[1] http://www.xmlplease.com/axis

-- 
Michael Ludwig
Re: [Neo] Traversers in the REST API
Mattias Persson wrote on 30.03.2010 at 16:06:49 (+0200):

> a JSON document describing the traverser, like:
>
>   {
>     "order": "depth first",
>     "uniqueness": "node",
>     "return evaluator": { "language": "javascript", "body": "function shouldReturn( pos ) {...}" },
>     "prune evaluator": { "language": "javascript", "body": "function" },
>     "relationships": [ { "direction": "outgoing", "type": "KNOWS" }, { "type": "LOVES" } ],
>     "max depth": 4
>   }
>
> Looking at the prune evaluator and return evaluator it'd be nice to define them in some language, f.ex javascript, ruby or python or whatever. We're initially thinking of using javax.script.* stuff (ScriptEngine) for that, it'd probably be enough, at least to get things going.

XSLT, which builds on XPath, works by having the processor traverse the tree and the user define templates featuring a match pattern. For every node, the processor dispatches to the best matching template, from where you can control further processing.

Now those match patterns are a subset of XPath, and rightly so: If the user were given the full power of XPath, it would easily get horribly expensive to determine the best matching template for a given node.

Likewise in a graph traversal, wouldn't it be reasonable to only allow something with restricted expressive and imperative power, like the match patterns in XSLT?

-- 
Michael Ludwig
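To make the roles of the two evaluators in the quoted traverser description concrete, here is a rough Python sketch. It is not the Neo4j traversal framework or the proposed REST API: `GRAPH`, `traverse`, `should_return` and `should_prune` are hypothetical names, and the graph is a plain dictionary. It shows a depth-first walk with node uniqueness, a relationship-type filter and a maximum depth, where one callback decides whether a position is returned and the other whether the walk continues below it.

```python
from typing import Callable, Dict, Iterator, List, Set, Tuple

# Hypothetical graph: node -> list of (relationship type, target node).
GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "A": [("KNOWS", "B"), ("LOVES", "C")],
    "B": [("KNOWS", "D")],
    "C": [("HATES", "E")],
    "D": [("KNOWS", "A")],
    "E": [],
}

Position = Tuple[str, int]  # (current node, depth)


def traverse(
    graph: Dict[str, List[Tuple[str, str]]],
    start: str,
    rel_types: Set[str],
    max_depth: int,
    should_return: Callable[[Position], bool],
    should_prune: Callable[[Position], bool],
) -> Iterator[str]:
    """Depth-first walk; each node is visited at most once."""
    seen = {start}
    stack: List[Position] = [(start, 0)]
    while stack:
        node, depth = stack.pop()
        pos = (node, depth)
        if should_return(pos):          # return evaluator: include in results?
            yield node
        if depth >= max_depth or should_prune(pos):
            continue                    # prune evaluator: do not go deeper
        # reversed() so the first listed relationship is expanded first
        for rel_type, target in reversed(graph[node]):
            if rel_type in rel_types and target not in seen:
                seen.add(target)
                stack.append((target, depth + 1))


result = list(
    traverse(
        GRAPH,
        "A",
        rel_types={"KNOWS", "LOVES"},
        max_depth=4,
        should_return=lambda pos: pos[1] > 0,    # everything except the start
        should_prune=lambda pos: pos[0] == "C",  # never expand below C
    )
)
print(result)  # ['B', 'D', 'C']
```

Here the evaluators are arbitrary Python callables, which is exactly the unrestricted power questioned above. A restricted pattern language would replace the two lambdas with something the server can analyse before running it.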
Re: [Neo] XPath in REST API
Hi Marko,

Marko Rodriguez wrote on 30.03.2010 at 12:50:21 (-0600):

>> Also, XPath being for trees, do you constrain the graph to tree form?
>
> XPath easily generalizes to work for graphs. See http://gremlin.tinkerpop.com and more specifically, http://wiki.github.com/tinkerpop/gremlin/basic-graph-traversals ... However, the // recursion operator can get out of control.

That's very interesting! What about cycles?

    A --knows-- B --knows-- C --knows-- A   # or
    A --knows-- B --knows-- C --knows-- B

Will the traverser follow these? Or does it maintain a map of seen edges and/or vertices so it will avoid cycles?

Also, this raises the question of traversal order: Let's assume that A --knows-- B, C and D. Is there an order specified for going along the edges, such as *document order* in XSLT? Or are edges specified to be unordered, such as attributes in XDM (XPath Data Model)?

-- 
Michael Ludwig
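One way a traverser could cope with the two cycles drawn above is to remember the vertices already on the current path and stop extending a path as soon as it would revisit one. The Python sketch below is an assumption about how this might be done, not a description of Gremlin or Neo4j internals; `KNOWS` and `paths` are made-up names.

```python
from typing import Dict, Iterator, List

# Hypothetical "knows" graph containing both cycles from the message:
# A -> B -> C -> A and A -> B -> C -> B.
KNOWS: Dict[str, List[str]] = {
    "A": ["B"],
    "B": ["C"],
    "C": ["A", "B"],
}


def paths(graph: Dict[str, List[str]], start: str) -> Iterator[List[str]]:
    """Yield every maximal path from start that never revisits a vertex."""

    def walk(path: List[str]) -> Iterator[List[str]]:
        extended = False
        for nxt in graph.get(path[-1], []):
            if nxt in path:         # would close a cycle: halt this branch
                continue
            extended = True
            yield from walk(path + [nxt])
        if not extended:            # dead end or only cycles ahead
            yield path

    yield from walk([start])


print(list(paths(KNOWS, "A")))  # [['A', 'B', 'C']]
```

The check is per path, so the walk terminates on any finite graph, but the number of distinct paths can still grow very quickly in a densely connected one.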
Re: [Neo] XPath in REST API
Hi Marko,

Marko Rodriguez wrote on 30.03.2010 at 14:42:44 (-0600):

> When doing //, it remembers previously seen elements and then halts that particular path when that element has been seen again. However, there are many many many paths in any complex enough graph (so usually this is just a memory and time explosion). Thus, // is usually avoided --- instead a while/foreach/repeat loop is usually opted for.

Understandably so!

Do you lend meaning to the other axes defined in XPath? For example parents, ancestors, following and preceding siblings? I'm struggling to see how all that would map from a tree to a general graph.

>> Also, this raises the question of traversal order: Let's assume that A --knows-- B, C and D. Is there an order specified for going along the edges, such as *document order* in XSLT? Or are edges specified to be unordered, such as attributes in XDM (XPath Data Model)?
>
> The traversal order is determined by how the underlying graph database serves up its results. So it's different for different graph databases having the same graph data.

So it's implementation-defined, which is random from the POV of a specification. And I can't see how edge order could be specified. For me, a graph newbie, there does not seem to be anything inherent in a graph that suggests an order of edges.

That's problematic in that it will lead to non-deterministic behaviour: Traversal halting points (and hence the list of edges and vertices being traversed) are up to the implementation, so you cannot use this to declaratively express a result to compute. Well, you can, but what you'll see won't be what you'll get - you'll be at the mercy of the implementation du jour.

Of course, you could still make it mandatory for the user to declare (and the implementation to define and adhere to!) an algorithm to unambiguously determine traversal order without relying on nitty-gritty implementation details (such as object id). I just cannot see what would be natural to a graph. I think it would have to be something on the data layer, or a meta attribute, so that you get sortable edges.

XSLT and XQuery are built on the XDM (XPath Data Model), which is an abstraction of an XML document. In fact, it's an ordered tree. Is there something like that for graphs that you could (or do) base your graph query language on?

Thanks!

-- 
Michael Ludwig
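The idea of a meta attribute that makes edges sortable can be sketched like this. It is a hypothetical illustration in Python: the `position` property, `EDGES` and `ordered_neighbours` are assumptions, not something Neo4j or Gremlin defines. The point is only that once the order comes from the data, the result no longer depends on how the storage layer happens to hand out the edges.

```python
from typing import Dict, List, Tuple

EdgeList = List[Tuple[str, Dict[str, int]]]  # (target vertex, edge properties)

# The same logical edges A -> B, C, D, stored in two different physical orders.
STORE_ONE: Dict[str, EdgeList] = {
    "A": [("D", {"position": 3}), ("B", {"position": 1}), ("C", {"position": 2})],
}
STORE_TWO: Dict[str, EdgeList] = {
    "A": [("C", {"position": 2}), ("D", {"position": 3}), ("B", {"position": 1})],
}


def ordered_neighbours(edges: Dict[str, EdgeList], node: str, key: str = "position") -> List[str]:
    """Neighbours sorted by an edge property; ties are broken by target name."""
    ranked = sorted(edges.get(node, []), key=lambda edge: (edge[1][key], edge[0]))
    return [target for target, _props in ranked]


print(ordered_neighbours(STORE_ONE, "A"))  # ['B', 'C', 'D']
print(ordered_neighbours(STORE_TWO, "A"))  # ['B', 'C', 'D']
```

The price is that every edge must carry the attribute and the traverser must sort before expanding, which is the declaration the user would have to make explicitly.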
Re: [Neo] XPath in REST API
Hi Marko,

Marko Rodriguez wrote on 30.03.2010 at 16:01:42 (-0600):

> Ha. There is a Gremlin mailing list if you are **super** interested :). http://groups.google.com/group/gremlin-users ... However, don't even try and join unless you are SUPER interested. :P

Thanks - too late, I already joined :-) Will be lurking, for the moment. I hope not to annoy people too much by continuing what has started here.

> Exactly. People tend to forget this point---that's why I stress it. Vertices are adjacent to edges and edges are adjacent to vertices. HyperGraphDB [ http://www.kobrix.com/hgdb.jsp ] throws that distinction out of the water and says all there is are 'atoms' and 'atoms' can be adjacent to each other. By making a distinction between edges and vertices, you are saying that an edge is a binary relationship between two vertices---this makes it a regular graph as opposed to a hypergraph. Neo4j/Gremlin/RDF/and_lots_others are regular graphs.

Thanks for explaining. I wasn't aware of hypergraphs. Sounds pretty experimental.

>> I understand outE, inE and bothE. Edges may have a direction (or two, or none - depending on the point of view), so in/out/both looks like the logical thing to do. But what about outV and inV? Vertices aren't directional, are they?
>
> Yea---outV means the outgoing vertex from the edge (the tail of the edge). inV means the incoming vertex of the edge (the head of the edge). In Neo4j speak, it's startNode and endNode, respectively.

Excuse my insolence, but couldn't you simplify by letting the user say:

    outE/V      # bound to be an inV
    inE/V       # bound to be an outV
    outE/outV   # confusing way of saying (in XPath) self::node()
    inE/inV     # ditto

Best,

-- 
Michael Ludwig
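The four combinations listed above can be checked against a tiny model. This is a Python sketch with made-up names (`Edge`, `out_e`, `in_e`), not Gremlin itself. It follows the definitions quoted in the message: `out_v` is the tail of an edge (startNode in Neo4j terms) and `in_v` is its head (endNode).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Edge:
    out_v: str   # tail: the vertex the edge leaves
    label: str
    in_v: str    # head: the vertex the edge arrives at


# Hypothetical data: A knows B, and C knows A.
EDGES: List[Edge] = [Edge("A", "knows", "B"), Edge("C", "knows", "A")]


def out_e(vertex: str) -> List[Edge]:
    """Edges leaving the vertex."""
    return [e for e in EDGES if e.out_v == vertex]


def in_e(vertex: str) -> List[Edge]:
    """Edges arriving at the vertex."""
    return [e for e in EDGES if e.in_v == vertex]


print([e.in_v for e in out_e("A")])    # ['B']  outE/inV: the neighbours A points to
print([e.out_v for e in in_e("A")])    # ['C']  inE/outV: the neighbours pointing at A
print([e.out_v for e in out_e("A")])   # ['A']  outE/outV: back at the start vertex
print([e.in_v for e in in_e("A")])     # ['A']  inE/inV: ditto
```

In this model only one of the two vertex steps leads anywhere new after a given edge step, which is the observation behind the proposed `outE/V` and `inE/V` shorthand.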