Re: [Neo] How to efficiently query in Neo4J?

2010-04-09 Thread Michael Ludwig
Alastair James wrote on 09.04.2010 at 14:04:37 (+0100)
[Re: [Neo] How to efficiently query in Neo4J?]:

 So, I suppose this question boils down to, is there an efficient way
 to calculate the union of two traversals without retrieving all result
 sets and performing the union in user code?

No need for two traversals if you annotate your category tree in Neo4j
the same way Celko has popularized for SQL, i.e. marking each category
with *left* and *right*. It's really not a question of graph or sets,
as in both cases what you deal with is a tree.

http://intelligent-enterprise.informationweek.com/001020/celko.jhtml

Note that this needs some custom logic for category tree updates. But
it's not difficult in SQL, and I think it's not much more difficult in
Neo4j either.
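
A minimal sketch of the left/right annotation, in plain Python rather than
against the Neo4j API. The category names and the helper names (annotate,
descendants) are made up for illustration. One depth-first walk numbers
every node, and a subtree lookup then becomes a range comparison instead
of a traversal.

```python
# Hypothetical category tree: parent -> list of children.
TREE = {
    "europe": ["italy", "france"],
    "italy": ["tuscany", "umbria"],
    "france": [],
    "tuscany": [],
    "umbria": [],
}


def annotate(tree, root):
    """Assign (left, right) numbers to every node in one depth-first walk."""
    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter
        for child in tree[node]:
            visit(child)
        counter += 1
        bounds[node] = (left, counter)

    visit(root)
    return bounds


def descendants(bounds, node):
    """Nodes whose interval lies strictly inside the interval of `node`."""
    left, right = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if left < l and r < right)


bounds = annotate(TREE, "europe")
print(bounds["italy"])                # (2, 7)
print(descendants(bounds, "italy"))   # ['tuscany', 'umbria']
```

The price is the one mentioned above: inserting or moving a category means
renumbering part of the tree.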

-- 
Michael Ludwig
___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo] Traversers in the REST API

2010-04-08 Thread Michael Ludwig
Tobias Ivarsson wrote on 08.04.2010 at 18:23:27 (+0200)
[Re: [Neo] Traversers in the REST API]:

 On Wed, Apr 7, 2010 at 3:05 PM, Alastair James al.ja...@gmail.com
 wrote:

  when we start talking about returning 1000s of nodes in JSON over
  HTTP just to get the first 10 this is clearly sub-optimal (as I
  build websites this is a very common use case). So, as you say,
  sorting and limiting can wait, but I suspect the HTTP API would
  benefit from offering it. Limiting need not require changes to the
  core API, it could be implemented as a second stage in the HTTP API
  code prior to output encoding.
 
 For paging / limiting: yes, you are absolutely right, this would not
 affect the core API at all, only the REST API. Limiting/paging is
 something we would probably add to the REST API before sorting.

Limiting and paging usually go hand in hand with sorting, in my
experience. Why would anyone want to page through an unsorted
collection?

 Sorting might be a similar case, but I still think the client would be
 better fitted to do sorting well.

The server has indexes to support the sorting. (If it doesn't, it has a
problem anyway.) What does the client have to support sorting? So how
would it be better fitted to do sorting well?

 But once paging / limiting is added it would be quite natural / useful
 to add sorting as well. What I want to avoid is keeping state on the
 server while waiting for the client to request the next page.

If you ensure a binary tree index is used to do the sorting, you should
be fine.
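
A minimal sketch of paging without server-side state, assuming the sort
key is unique. A sorted Python list with binary search stands in for the
ordered index; the function name page_after and the sample keys are
hypothetical. The client sends back the last key it received, and the
server seeks to that position in the index for the next page.

```python
import bisect


def page_after(sorted_keys, last_seen, limit):
    """Return up to `limit` keys strictly greater than `last_seen`.

    `sorted_keys` stands in for an ordered index. Passing None as
    `last_seen` requests the first page.
    """
    if last_seen is None:
        start = 0
    else:
        start = bisect.bisect_right(sorted_keys, last_seen)
    return sorted_keys[start:start + limit]


index = sorted(["rome", "florence", "pisa", "siena", "lucca", "arezzo", "milan"])
first = page_after(index, None, 3)
second = page_after(index, first[-1], 3)
print(first)    # ['arezzo', 'florence', 'lucca']
print(second)   # ['milan', 'pisa', 'rome']
```

Nothing is kept between the two calls; the position travels with the
request.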

-- 
Michael Ludwig


Re: [Neo] Date effectiveness (Time Variance) implementation in Neo4J

2010-04-08 Thread Michael Ludwig
suryadev vasudev wrote on 06.04.2010 at 23:26:35 (-0700)
[[Neo] Date effectiveness (Time Variance) implementation in Neo4J]:

 We are exploring Neo4J for a resource management application.

[ straightforward requirements list without any discernible graph
specifics snipped ]

 In Neo4J, we created Library, Book-Club, Publisher, Student and Books.
 We are finding it difficult to implement the time variance.

Oh, that ...

 The business requirements are:-
 1. The book publisher can lease books till his end registering date
 2. Publisher can specify lease start date and end date for each book
 3. Do not lend beyond end leasing date
 4. Do not lend beyond end membership date
 5. Query Student-book relationships (What books were borrowed/
 reserved, who was the publisher, what was the book club) for a given
 date range
 
 How do we model the date in Neo4J?

Heretical counter-question:

Why model the date in Neo4J if any SQL database provides full-spectrum
date-time functionality?
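
To illustrate, a minimal sketch of requirement 3 as a SQL range check,
using SQLite through Python's sqlite3 module. The table and column names
are invented for the example. SQLite has no dedicated date type, so the
dates are ISO-8601 strings, which compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lease (book TEXT, lease_start TEXT, lease_end TEXT)")
conn.executemany(
    "INSERT INTO lease VALUES (?, ?, ?)",
    [
        ("Book A", "2010-01-01", "2010-06-30"),
        ("Book B", "2010-03-01", "2010-04-15"),
    ],
)


def lendable(conn, start, end):
    """Books whose lease period covers the whole lending period."""
    rows = conn.execute(
        "SELECT book FROM lease "
        "WHERE lease_start <= ? AND lease_end >= ? "
        "ORDER BY book",
        (start, end),
    )
    return [row[0] for row in rows]


print(lendable(conn, "2010-04-01", "2010-04-10"))  # ['Book A', 'Book B']
print(lendable(conn, "2010-04-10", "2010-04-20"))  # ['Book A']
```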

-- 
Michael Ludwig


Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread Michael Ludwig
Alastair James wrote on 07.04.2010 at 15:53:50 (+0100)
[[Neo] How to efficiently query in Neo4J?]:

 Briefly, the site consists of posts, each tagged with various
 attributes, e.g. (it's a travel site) location, theme, cost etc... Also
 the tags are hierarchical. So, for location we have (say) 'tuscany'
 inside 'italy' inside 'europe'. For theme we have (say) 'cycling'
 inside 'activity'.

After giving this some thought, it looks to me as if there is nothing
particularly graphy in your example. I know, most everything is a graph,
but here the data is more regular: Your hierarchical catalog of tags
immediately made me think of Joe Celko's nested sets, which is a very
efficient way to represent trees in terms of sets, as found in SQL
databases. (Heresy again, I know, but well.) And the relationship of
posts to tags is simply N-M, and that's it.
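
A minimal sketch of that model in SQL, run through Python's sqlite3
module. The schema, the left/right numbers and the helper posts_under are
assumptions made for the example. It finds posts that carry a tag
somewhere below 'italy' and a tag somewhere below 'activity' with a single
join on the nested-set ranges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER);
CREATE TABLE post_tag (post TEXT, tag TEXT REFERENCES tag(name));
INSERT INTO tag VALUES ('europe', 1, 6), ('italy', 2, 5), ('tuscany', 3, 4),
                       ('activity', 7, 10), ('cycling', 8, 9);
INSERT INTO post_tag VALUES ('p1', 'tuscany'), ('p1', 'cycling'),
                            ('p2', 'italy'), ('p3', 'cycling');
""")


def posts_under(conn, *ancestors):
    """Posts that have, for every given tag, some tag inside its subtree."""
    placeholders = ",".join("?" * len(ancestors))
    sql = f"""
        SELECT pt.post
        FROM post_tag pt
        JOIN tag t ON t.name = pt.tag
        JOIN tag a ON t.lft BETWEEN a.lft AND a.rgt
        WHERE a.name IN ({placeholders})
        GROUP BY pt.post
        HAVING COUNT(DISTINCT a.name) = ?
        ORDER BY pt.post
    """
    return [row[0] for row in conn.execute(sql, (*ancestors, len(ancestors)))]


print(posts_under(conn, "italy", "activity"))  # ['p1']
print(posts_under(conn, "europe"))             # ['p1', 'p2']
```

Dropping the HAVING clause turns the intersection into a union of the
subtrees.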

There aren't any real links (edges) between posts, which arguably would
make your data model more graphy. In your model, related posts are
related by virtue of their attributes (they share some tags, or are
posted by the same user), and not eis ipsis. So I'd say there is not
much in the way of graphiness.

-- 
Michael Ludwig


Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread Michael Ludwig
Max De Marzi Jr. wrote on 08.04.2010 at 16:48:18 (-0500)
[Re: [Neo] How to efficiently query in Neo4J?]:

 You know this is something that I think needs to be made clear...
 using just the graph is not the right way to go unless you have a very
 special application.

 Some things are better not done in the graph.  So I decided to keep
 that in tables, and just move the person relationships to the graph
 (works with, manages, knows, friends, etc).
 
 I treat the graph like a specialized index. Makes a lot more sense
 now, and I get the best of both worlds.

Exactly what I think. An iterable index, and a great one for the kind of
graphy queries that cannot be done efficiently using sets and joins.

Any thoughts on what constitutes *graphiness*, if I may venture this
term?

-- 
Michael Ludwig


Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread Michael Ludwig
rick.bullotta wrote on 08.04.2010 at 15:16:11 (-0700)
[Re: [Neo] How to efficiently query in Neo4J?]:

 Factor in a wide range of SLAs needed for performance vs availability
 vs affordability vs scalability vs administration costs, and the
 equation gets a whole lot more complicated.

Granted.

 I'm sure there's a graphy-model for the tag/post example that could be
 made smoking fast with Neo also.

Sure, but there's also a way of looking at screws that might suggest you
should use a hammer ;-) and it would be wrong. Which doesn't mean it
couldn't be modeled for the tag/post example - just a general caveat to
think about both tools and problems when trying to find a good solution.

 Throw columnar storage, key-value, and document DB's into the mix, and
 the good news is that we have a lot of weapons in our arsenal now to
 tackle very demanding and diverse application challenges!

Yes, it's becoming very interesting. Lots of new high-level tools for
specialized or relaxed requirements.

SQL won't be dethroned, though.
-- 
Michael Ludwig


Re: [Neo] Requirements for an event framework for Neo4j

2010-04-01 Thread Michael Ludwig
Laurent Laborde wrote on 31.03.2010 at 13:52:52 (+0200):
 I don't remember the exact english name but...
 are you, in fact, planning some kind of stored function (like PLSQL in
 postgresql) ?
 
 (example of stored function  : BEFORE INSERT ON something FOR EACH
 ROW EXECUTE someFunction() )

I think what you're referring to here is *triggers* (as common in SQL
databases), which react on events, not dissimilar to what has been
outlined by Tobias in the mail you're replying to.
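
For readers who have not met them, a minimal sketch of a trigger, using
SQLite through Python's sqlite3 module. The table and trigger names are
invented, and SQLite puts the trigger body inline instead of calling a
stored function as PostgreSQL does. The trigger reacts to an insert event
by writing a log row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE something (name TEXT);
CREATE TABLE audit_log (message TEXT);

-- Fires once for every row inserted into `something`.
CREATE TRIGGER log_insert AFTER INSERT ON something
FOR EACH ROW
BEGIN
    INSERT INTO audit_log VALUES ('inserted ' || NEW.name);
END;
""")

conn.execute("INSERT INTO something VALUES ('node-1')")
log = [row[0] for row in conn.execute("SELECT message FROM audit_log")]
print(log)  # ['inserted node-1']
```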
-- 
Michael Ludwig


Re: [Neo] XPath in REST API

2010-03-30 Thread Michael Ludwig
Mattias Persson wrote on 30.03.2010 at 15:02:19 (+0200):
 We're discussing how to expose traversers in the REST API. One of the
 ideas that was brought up (more emails with the rest of the ideas are
 coming) was to use xpath directly in the URIs.

I have some experience working with XSLT and XPath, and I'm probably
missing the context here, so I'm wondering:

You're probably just considering using some subset of XPath? Like self,
child and attribute axes? Because even 1.0 gives you considerable power
in navigating trees [1], not to mention 2.0, which has conditional
branching, loops and lots more.

Also, XPath being for trees, do you constrain the graph to tree form?

[1] http://www.xmlplease.com/axis
-- 
Michael Ludwig


Re: [Neo] Traversers in the REST API

2010-03-30 Thread Michael Ludwig
Mattias Persson wrote on 30.03.2010 at 16:06:49 (+0200):

 a JSON document describing the traverser, like:
 
   { "order": "depth first",
     "uniqueness": "node",
     "return evaluator":
       { "language": "javascript",
         "body": "function shouldReturn( pos ) {...}" },
     "prune evaluator":
       { "language": "javascript",
         "body": "function" },
     "relationships": [
       { "direction": "outgoing",
         "type": "KNOWS" },
       { "type": "LOVES" }
     ],
     "max depth": 4 }

 Looking at the prune evaluator and return evaluator it'd be nice
 to define them in some language, f.ex javascript, ruby or python or
 whatever. We're initially thinking of using javax.script.* stuff
 (ScriptEngine) for that, it'd probably be enough, at least to get
 things going.

XSLT, which builds on XPath, works by having the processor traverse the
tree and the user define templates featuring a match pattern. For every
node, the processor dispatches to the best matching template, from where
you can control further processing.

Now those match patterns are a subset of XPath, and rightly so: If the
user were given the full power of XPath, it would easily get horribly
expensive to determine the best matching template for a given node.

Likewise in a graph traversal, wouldn't it be reasonable to only allow
something with restricted expressive and imperative power, like the
match patterns in XSLT?
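
A minimal sketch of what such a restricted description could look like,
in plain Python and not tied to any Neo4j API. The graph, the function
traverse and the two description keys it understands are assumptions.
The client only declares relationship types and a maximum depth, so the
server never evaluates user-supplied code. For brevity the walk is
breadth first.

```python
from collections import deque

# Hypothetical graph: node -> list of (relationship type, target node).
GRAPH = {
    "A": [("KNOWS", "B"), ("LOVES", "C")],
    "B": [("KNOWS", "D")],
    "C": [("WORKS_WITH", "E")],
    "D": [],
    "E": [],
}


def traverse(graph, start, description):
    """Breadth-first walk driven only by a declarative description."""
    allowed = set(description["relationships"])
    max_depth = description["max depth"]
    seen = {start}
    queue = deque([(start, 0)])
    result = []
    while queue:
        node, depth = queue.popleft()
        if depth > 0:
            result.append(node)
        if depth == max_depth:
            continue  # do not expand beyond the declared depth
        for rel_type, target in graph[node]:
            if rel_type in allowed and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return result


description = {"relationships": ["KNOWS", "LOVES"], "max depth": 2}
print(traverse(GRAPH, "A", description))  # ['B', 'C', 'D']
```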

-- 
Michael Ludwig


Re: [Neo] XPath in REST API

2010-03-30 Thread Michael Ludwig
Hi Marko,

Marko Rodriguez wrote on 30.03.2010 at 12:50:21 (-0600):
  Also, XPath being for trees, do you constrain the graph to tree
  form?
 
 XPath easily generalizes to work for graphs. See
 http://gremlin.tinkerpop.com and more specifically,
 http://wiki.github.com/tinkerpop/gremlin/basic-graph-traversals ...
 However, the // recursion operator can get out of control.

That's very interesting!

What about cycles?

  A --knows-- B --knows-- C --knows-- A   # or
  A --knows-- B --knows-- C --knows-- B

Will the traverser follow these? Or does it maintain a map of seen edges
and/or vertices so it will avoid cycles?

Also, this raises the question of traversal order: Let's assume that
A --knows-- B, C and D. Is there an order specified for going along the
edges, such as *document order* in XSLT? Or are edges specified to be
unordered, such as attributes in XDM (XPath Data Model)?

-- 
Michael Ludwig


Re: [Neo] XPath in REST API

2010-03-30 Thread Michael Ludwig
Hi Marko,

Marko Rodriguez wrote on 30.03.2010 at 14:42:44 (-0600):
 When doing //, it remembers previously seen elements and then halts
 that particular path when that element has been seen again. However,
 there are many many many paths in any complex enough graph (so usually
 this is just a memory and time explosion). Thus, // is usually avoided
 --- instead a while/foreach/repeat loop is usually opted for.

Understandably so! Do you lend meaning to the other axes defined in
XPath? For example parents, ancestors, following and preceding siblings?
I'm struggling to see how all that would map from a tree to a general
graph.

  Also, this raises the question of traversal order: Let's assume that
  A --knows-- B, C and D. Is there an order specified for going along
  the edges, such as *document order* in XSLT? Or are edges specified
  to be unordered, such as attributes in XDM (XPath Data Model)?
 
 The traversal order is determined by how the underlying graph database
 serves up its results. So it's different for different graph databases
 having the same graph data.

So it's implementation-defined, which is random from the POV of a
specification. And I can't see how edge order could be specified. For
me, a graph newbie, there does not seem to be anything inherent in a
graph that suggests an order of edges.

That's problematic in that it will lead to non-deterministic behaviour:
Traversal halting points (and hence the list of edges and vertices being
traversed) are up to the implementation, so you cannot use this to
declaratively express a result to compute. Well, you can, but what
you'll see won't be what you'll get - you'll be at the mercy of the
implementation du jour.

Of course, you could still make it mandatory for the user to declare
(and the implementation to define and adhere to!) an algorithm to
unambiguously determine traversal order without relying on nitty-gritty
implementation details (such as object id). I just cannot see what would
be natural to a graph. I think it would have to be something on the data
layer, or a meta attribute, so that you get sortable edges.
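
A minimal sketch of that idea in plain Python, assuming every edge
carries a sortable attribute, here called rank (a hypothetical name).
The walk follows each vertex's edges in ascending rank and remembers the
vertices it has seen, so the result depends neither on storage order nor
on cycles.

```python
# Hypothetical edge list: (source, target, rank). The rank is the
# sortable data-layer attribute; storage order is deliberately jumbled.
EDGES = [
    ("A", "D", 3),
    ("A", "B", 1),
    ("A", "C", 2),
    ("B", "C", 1),
    ("C", "A", 1),  # closes a cycle back to A
]


def depth_first(edges, start):
    """Visit vertices depth first, following edges by ascending rank."""
    outgoing = {}
    for source, target, rank in edges:
        outgoing.setdefault(source, []).append((rank, target))

    seen = []

    def visit(vertex):
        if vertex in seen:
            return  # already visited: halt this path instead of looping
        seen.append(vertex)
        for _rank, target in sorted(outgoing.get(vertex, [])):
            visit(target)

    visit(start)
    return seen


print(depth_first(EDGES, "A"))                  # ['A', 'B', 'C', 'D']
print(depth_first(list(reversed(EDGES)), "A"))  # ['A', 'B', 'C', 'D']
```

Edges with equal rank are ordered by target name here, which is just one
possible tie-breaking rule.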

XSLT and XQuery are built on the XDM (XPath Data Model), which is an
abstraction of an XML document. In fact, it's an ordered tree. Is there
something like that for graphs that you could (or do) base your graph
query language on?

Thanks!
-- 
Michael Ludwig


Re: [Neo] XPath in REST API

2010-03-30 Thread Michael Ludwig
Hi Marko,

Marko Rodriguez wrote on 30.03.2010 at 16:01:42 (-0600):
 Ha. There is a Gremlin mailing list if you are **super** interested
 :). http://groups.google.com/group/gremlin-users ... However, don't
 even try and join unless you are SUPER interested. :P

Thanks - too late, I already joined :-) Will be lurking, for the moment.
I hope not to annoy people too much by continuing what has started here.

 Exactly. People tend to forget this point---that's why I stress it.
 Vertices are adjacent to edges and edges are adjacent to vertices.
 HyperGraphDB [ http://www.kobrix.com/hgdb.jsp ] throws that
 distinction out of the water and says all there is are 'atoms' and
 'atoms' can be adjacent to each other. By making a distinction between
 edges and vertices, you are saying that an edge is a binary
 relationship between two vertices---this makes it a regular graph as
 opposed to a hypergraph. Neo4j/Gremlin/RDF/and_lots_others are regular
 graphs.

Thanks for explaining. I wasn't aware of hypergraphs. Sounds pretty
experimental.

  I understand outE, inE and bothE. Edges may have a direction (or
  two, or none - depending on the point of view), so in/out/both looks
  like the logical thing to do.
  
  But what about outV and inV? Vertices aren't directional, are they?
 
 Yea---outV means the outgoing vertex from the edge (the tail of the
 edge). inV means the incoming vertex of the edge (the head of the
 edge). In Neo4j speak, it's startNode and endNode, respectively.

Excuse my insolence, but couldn't you simplify by letting the user say:

  outE/V    # bound to be an inV
  inE/V     # bound to be an outV

  outE/outV # confusing way of saying (in XPath) self::node()
  inE/inV   # ditto
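
To make the four combinations concrete, a minimal sketch in plain Python.
This is not Gremlin syntax; the Edge tuple and the helpers out_e and in_e
are made up to mirror the step names, with out_v as the tail and in_v as
the head of an edge.

```python
from collections import namedtuple

# out_v is the tail (Neo4j: start node), in_v the head (end node).
Edge = namedtuple("Edge", ["out_v", "label", "in_v"])

EDGES = [
    Edge("A", "knows", "B"),
    Edge("A", "knows", "C"),
    Edge("D", "knows", "A"),
]


def out_e(vertex):
    """Edges leaving `vertex`."""
    return [e for e in EDGES if e.out_v == vertex]


def in_e(vertex):
    """Edges arriving at `vertex`."""
    return [e for e in EDGES if e.in_v == vertex]


print([e.in_v for e in out_e("A")])   # outE/inV:  ['B', 'C']
print([e.out_v for e in in_e("A")])   # inE/outV:  ['D']
print([e.out_v for e in out_e("A")])  # outE/outV: ['A', 'A']
print([e.in_v for e in in_e("A")])    # inE/inV:   ['A']
```

The last two only lead back to the starting vertex, once per edge.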

Best,
-- 
Michael Ludwig