Re: [Neo4j] REST API thoughts/questions/feedback

2011-04-19 Thread Michael DeHaan
On Tue, Apr 19, 2011 at 10:48 AM, Jacob Hansson  wrote:
> Hey Michael, big thanks again for taking the time to write down your
> experiences working with the REST API.
>
> See inline response.

Thanks for the follow-up.  That's quite helpful and lets me know I'm
not doing the unique-key implementation in too non-idiomatic
a way.

I'll get back to you about doc fixes, and should the bindings
materialize further, I'll share some examples.

--Michael
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] REST API thoughts/questions/feedback

2011-04-19 Thread Jacob Hansson
Hey Michael, big thanks again for taking the time to write down your
experiences working with the REST API.

See inline response.

On Mon, Apr 18, 2011 at 4:10 PM, Michael DeHaan wrote:

> Hi all.
>
> I've been working recently on writing a Perl binding for the Neo4j
> REST API and thought I'd share some observations, and hopefully can
> get a few suggestions on some things.  You can see some of the Perl
> work in progress here -- https://github.com/mpdehaan/Elevator (search
> for *Neo*.pm).  Basically it's a data layer that allows objects to be
> plugged between SQL, NoSQL (Riak and Mongo so far) and Neo4j.
>
> The idea is we can build data classes and just call "commit()" and the
> like on them, though if the class is backed by Neo4j obviously we'll
> be able to add links between them, query the links, and so forth.  I'm
> still working on that.
>
> Basically the REST API *is* working for me, but here are my
> observations about it:
>
> (1)  I'd like to be able to specify the node ID of a node
> before I create it.  I like having a primary key, as I can do with
> things like Mongo and Riak.  If I do not have a primary key, I have to
> search before I add, "upsert" becomes difficult, as do deletions, and
> I have to worry about which copy of a given object is authoritative.
> I understand this can't work for everyone, but it seems like it would be
> useful. If that can be done now, I'd love info on how to!
>

I think the current "standard" approach to key/value storage is, like you
mention, to store unique keys in an index. This does mean you have to build
"upsert" abstractions yourself, always doing an index lookup before inserts
or updates.
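
As a rough illustration of that lookup-before-write pattern, here is a
minimal sketch in Python. It runs against an in-memory stand-in rather
than a real server, so InMemoryGraph, index_lookup, index_add and
upsert are all hypothetical names made up for this example, not part of
the Neo4j REST API.

```python
class InMemoryGraph:
    """Hypothetical stand-in for a graph store: nodes plus a key/value index."""

    def __init__(self):
        self.nodes = {}    # node id -> properties
        self.index = {}    # (key, value) -> list of node ids
        self._next_id = 0  # ids are assigned by the store, not by the caller

    def create_node(self, props):
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = dict(props)
        return node_id

    def index_add(self, key, value, node_id):
        self.index.setdefault((key, value), []).append(node_id)

    def index_lookup(self, key, value):
        return list(self.index.get((key, value), []))


def upsert(graph, key, value, props):
    """Look up the unique key first; update on a hit, create and index on a miss."""
    hits = graph.index_lookup(key, value)
    if len(hits) > 1:
        # A lookup returns a list, so duplicates have to be handled explicitly.
        raise ValueError(f"{len(hits)} nodes share {key}={value!r}")
    if hits:
        graph.nodes[hits[0]].update(props)
        return hits[0], False
    node_id = graph.create_node({key: value, **props})
    graph.index_add(key, value, node_id)
    return node_id, True


graph = InMemoryGraph()
first = upsert(graph, "email", "a@example.org", {"name": "Ann"})
second = upsert(graph, "email", "a@example.org", {"name": "Ann B."})
```

The second call finds the existing node and updates it instead of
creating a duplicate. Against a real server each step would be a
separate HTTP call, so two clients could both miss the lookup and both
create a node; the sketch does nothing to prevent that.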

As far as allowing neo4j clients to set ids for nodes, I think the problems
that would create (for instance in High Availability setups where each slave
gets a set of ids it can assign) would likely outweigh the
benefits.


>
> (2)  I'd like a delete_all kind of API to be able to delete all nodes
> matching particular criteria versus having to do a search.   For
> instance, I may need to rebuild the database, or part of it, and it's
> not clear how to drop it.   Also, is there a way to drop the entire
> database via REST?
>

This feels like a two-part idea, both of which I like :)

First, the ability to do manipulating operations like deleting and/or
editing data on a large scale without having to pull down each node over
http would be awesome. There is talk about putting together a query
language, and that could potentially be outfitted to do mutating operations,
similar to how SQL was extended to do that. Will definitely keep this in
mind!

Second, the ability to "nuke" the database I think is a great thing to have
in a development environment. A feature we're discussing is the ability to
have multiple databases running in each neo4j server, allowing you to nuke
and create databases as appropriate.

For a faster fix, take a look at Michael Hunger's db-nuker plugin:
https://github.com/jexp/neo4j-clean-remote-db-addon


> (3)  I'd like to be able to have the "key" of the node automatically
> added to the index without having to make a second call.  Ideally I'd
> like to be able to configure the server to auto-index certain fields,
> which is something some of the NoSQL/search tools offer. Similarly,
> when updating the node, the index should auto update without an
> explicit call to the indexer.
>

Agreed, auto-indexing would be *awesome*. There are some hard problems
related to doing auto indexing *well* that need to be solved first, but this
is something that I really hope we will end up implementing.


>
> (4)  The capability to do an "upsert" would be very useful: update the
> node if one exists for the given key; if not, create it.
>

Like I said above, I think the current approach is to put this logic on the
client side, which is slower, but the logic for doing this without
user-defined key-value style ids would potentially be very complex. I might
be wrong, but my gut feeling is that we can't do this well if we don't
have user-defined ids.


>
> (5)   It seems the indexes are the only means of search?   If I need
> to search on a field that isn't indexed (say in production, I need to
> add a new index), how do I go about adding all the nodes that belong
> in that index *to* it?   It seems I'd need to be keeping at least an
> index of all nodes of a given "type" all along, so I could at least
> iterate over those?
>

The main means of searching the graph structure inside a neo4j database is
by traversing it. Basically, you write a description for how to travel the
graph and what data to return, and then you get a list of nodes, a list of
relationships or a list of paths back, depending on what you asked for.

The indexes are currently mainly used for simple lookups and for finding
starting points for traversals.

See http://components.neo4j.org/neo4j-server/milestone/rest.html#Traverse
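
To make the idea of a traversal description a bit more concrete, here
is a minimal sketch in Python over a toy in-memory graph. It is not the
REST traversal format from the docs linked above; GRAPH, traverse and
its parameters are hypothetical names chosen for this example. It only
shows the shape of the idea: say which relationships to follow and how
deep, then choose whether to get nodes or paths back.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relationship type, neighbour).
GRAPH = {
    "alice": [("KNOWS", "bob"), ("KNOWS", "carol")],
    "bob": [("KNOWS", "dave")],
    "carol": [("WORKS_WITH", "dave")],
    "dave": [],
}


def traverse(graph, start, rel_type, max_depth, return_type="node"):
    """Breadth-first walk along one relationship type, up to max_depth hops."""
    seen = {start}
    queue = deque([(start, [start])])
    results = []
    while queue:
        node, path = queue.popleft()
        if len(path) > 1:  # skip the start node itself
            results.append(path if return_type == "path" else node)
        if len(path) - 1 >= max_depth:
            continue  # depth limit reached, do not expand further
        for rel, neighbour in graph[node]:
            if rel == rel_type and neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [neighbour]))
    return results


nodes = traverse(GRAPH, "alice", "KNOWS", max_depth=2)
paths = traverse(GRAPH, "alice", "KNOWS", max_depth=2, return_type="path")
```

Here nodes is ["bob", "carol", "dave"], and paths holds the route taken
to each of them, for example ["alice", "bob", "dave"]. The WORKS_WITH
relationship is never followed because the description only asks for
KNOWS.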


>
> I think most of the underl

[Neo4j] REST API thoughts/questions/feedback

2011-04-18 Thread Michael DeHaan
Hi all.

I've been working recently on writing a Perl binding for the Neo4j
REST API and thought I'd share some observations, and hopefully can
get a few suggestions on some things.  You can see some of the Perl
work in progress here -- https://github.com/mpdehaan/Elevator (search
for *Neo*.pm).  Basically it's a data layer that allows objects to be
plugged between SQL, NoSQL (Riak and Mongo so far) and Neo4j.

The idea is we can build data classes and just call "commit()" and the
like on them, though if the class is backed by Neo4j obviously we'll
be able to add links between them, query the links, and so forth.  I'm
still working on that.

Basically the REST API *is* working for me, but here are my
observations about it:

(1)  I'd like to be able to specify the node ID of a node
before I create it.  I like having a primary key, as I can do with
things like Mongo and Riak.  If I do not have a primary key, I have to
search before I add, "upsert" becomes difficult, as do deletions, and
I have to worry about which copy of a given object is authoritative.
I understand this can't work for everyone, but it seems like it would be
useful. If that can be done now, I'd love info on how to!

(2)  I'd like a delete_all kind of API to be able to delete all nodes
matching particular criteria versus having to do a search.   For
instance, I may need to rebuild the database, or part of it, and it's
not clear how to drop it.   Also, is there a way to drop the entire
database via REST?

(3)  I'd like to be able to have the "key" of the node automatically
added to the index without having to make a second call.  Ideally I'd
like to be able to configure the server to auto-index certain fields,
which is something some of the NoSQL/search tools offer. Similarly,
when updating the node, the index should auto update without an
explicit call to the indexer.

(4)  The capability to do an "upsert" would be very useful: update the
node if one exists for the given key; if not, create it.

(5)   It seems the indexes are the only means of search?   If I need
to search on a field that isn't indexed (say in production, I need to
add a new index), how do I go about adding all the nodes that belong
in that index *to* it?   It seems I'd need to be keeping at least an
index of all nodes of a given "type" all along, so I could at least
iterate over those?
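
To illustrate what (5) would mean in practice, here is a minimal sketch
in Python of backfilling a new index by iterating over a per-type index.
It uses plain dictionaries rather than a server, and backfill_index,
type_index and the sample data are hypothetical, made up for this
example.

```python
def backfill_index(nodes, type_index, node_type, field):
    """Build a new field value -> [node ids] index from nodes of one type."""
    new_index = {}
    for node_id in type_index.get(node_type, []):
        props = nodes[node_id]
        if field in props:  # nodes lacking the field are left unindexed
            new_index.setdefault(props[field], []).append(node_id)
    return new_index


# Hypothetical data: node id -> properties, plus a per-type index kept all along.
nodes = {
    1: {"type": "person", "city": "Raleigh"},
    2: {"type": "person", "city": "Malmo"},
    3: {"type": "company", "city": "Malmo"},
    4: {"type": "person"},
}
type_index = {"person": [1, 2, 4], "company": [3]}

city_index = backfill_index(nodes, type_index, "person", "city")
```

Without the per-type index there is nothing to iterate over, which is
the concern raised in (5).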

I think most of the underlying questions/problems I have are that I'm
trying to make sure graph elements are unique for some criteria, and
this requires that I make more API calls than normal, and have to
implement this in my library and not in the server -- which could be
fragile and certainly isn't atomic.

I've also noticed some minor things, which were slight stumbling blocks:

 * It seems that while the application says it takes JSON, it will
actually accept things as a key/value pair form submission, and may
prefer it that way.  This could be my code though and I need to debug
this further.
 * At one point in the API docs it suggests POSTing a hash as { key,
value }.  In JSON, this should be { key : value }.
 * Some API documentation online refers to the default port being
 and didn't mention the "/db/..." prefix to the URLs.
 * While I understand "proper" REST is politically correct, I'd be
really happy with simple endpoints that always could take POSTs, or
the ability to always do a POST.   Makes calling code simpler.
 * In the documentation, it was unclear whether "my_nodes" really
needed to be "my_nodes" or was some sort of namespace that I could or
should use.   Is there a way to keep graphs in different namespaces?
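
For the JSON points above, here is a minimal sketch in Python of what a
JSON POST (as opposed to a form submission) looks like from the client
side. The URL, port and property names are assumptions for
illustration only, and the request is built but not sent.

```python
import json
import urllib.request

# Assumed values for illustration: host, port and path depend on your server.
url = "http://localhost:7474/db/data/node"
properties = {"name": "Michael"}

# A JSON object is written { key : value }, which json.dumps produces.
body = json.dumps(properties).encode("utf-8")

request = urllib.request.Request(
    url,
    data=body,
    headers={"Content-Type": "application/json", "Accept": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Setting Content-Type explicitly matters here: without it, urllib labels
a request body as application/x-www-form-urlencoded when the request is
sent, which may be related to the form-submission behaviour described
above.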

In all, it's actually looking pretty good, though knowing what this
object key is in advance, and having a way to avoid duplicate objects
would help tremendously.   I like the idea that the URLs come back
when adding objects, in particular, as it helps make the REST calls
available for a particular node more self-documenting.

I'd be happy to try to explain further if any of that didn't make
sense -- particularly I'd be very interested in how to specify a "key"
for an element in advance, so I didn't have to rely on lookups each
time I need the node ID.   Since the lookup can return a list, it
doesn't guarantee I can get back a specific node.

Thanks!

--Michael DeHaan