Re: [Neo4j] Server Plugin Example to help with large queries over REST API

Todd Chaffee Tue, 10 May 2011 10:44:28 -0700

Hi Jake,

I apologize for taking so long to get back to you. I'm happy with all your
answers and it's good to hear paging solutions and index features are both
on the horizon.  Fair enough to give both topics the full analysis they
deserve and to come up with a good solution.


I think the only open point is your question about the vendor specific MIME
types, which I hope could help make the job of the REST API designers a lot
easier.  I'll try to give a concrete example.

Today each document in the list returned from /index/node has the
following format (let's call it the "assumed" initial version):

"my-node-index" : {
    "template" : "http://localhost:7474/db/data/index/node/my-node-index/{key}/{value}",
    "_blueprints:type" : "MANUAL",
    "provider" : "lucene",
    "type" : "fulltext"
}

As long as the client can only accept application/json it puts a lot of
pressure on the REST API team to make excellent decisions that
will fulfil all future needs by only ever *adding* new elements to this doc.

We could add something like this (assumed version 2.0, note paging and limit
params):

"my-node-index" : {
    "template" : "http://localhost:7474/db/data/index/node/my-node-index/{key}/{value}",
    "_blueprints:type" : "MANUAL",
    "provider" : "lucene",
    "type" : "fulltext",
    "provider-specific-resources" : {
        "query-template" : "http://localhost:7474/db/data/index/my-node-index/query?query={query}&limit={integer}&page={integer}&pagesize={integer}"
    }
}

But when and if a stable URI format is decided for queries across *all*
index providers and we want to promote the query template URI to the common
area like the following, we break a lot of clients.  (Note also the
'param-we-never-thought-of' which you don't find in the example above).

"my-node-index" : {
    "template" : "http://localhost:7474/db/data/index/node/2062972744/{key}/{value}",
    "query-template" : "http://localhost:7474/db/data/index/my-node-index?query={query}&param-we-never-thought-of={value}&limit={integer}&page={integer}&pagesize={integer}",
    "_blueprints:type" : "MANUAL",
    "provider" : "lucene",
    "type" : "fulltext",
    "provider-specific-resources" : {
        "some-new-resource" : "http://localhost:7474/db/data/index/some-uri/goes/here"
    }
}

If instead of specifying application/json in each case above, the client
specified

Accept:application/vnd.neo4j.graphdb+json, then
Accept:application/vnd.neo4j.graphdb-v2+json, and finally
Accept:application/vnd.neo4j.graphdb-v3+json

clients would not break and you could make radical changes to the format of
the returned document to reflect best practice as it evolves.

Clients are no longer saying "I want *something* back in json, please never
introduce breaking changes", they are saying "I want a document in the exact
format as specified in this vendor's version X of the API, and yep, in
json".
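
To make the idea a bit more concrete, here is a rough Python sketch of how
a server might pick the document format from the Accept header. This is
only an illustration under my own assumptions: the function names
(requested_version, render_index) are hypothetical, the media type names
are the ones proposed above and not an existing Neo4j feature, and real
Accept parsing (multiple types, q-values) is left out.

```python
import re

# Matches the vendor media types proposed above (an assumption, not an
# existing Neo4j API). No "-vN" suffix is treated as version 1.
VENDOR_TYPE = re.compile(r"application/vnd\.neo4j\.graphdb(?:-v(\d+))?\+json")


def requested_version(accept_header):
    """Return the requested format version, or None for non-vendor types."""
    match = VENDOR_TYPE.fullmatch(accept_header.strip())
    if match is None:
        return None
    return int(match.group(1)) if match.group(1) else 1


def render_index(version):
    """Build the index document in the layout promised for that version."""
    base = "http://localhost:7474/db/data/index"
    doc = {
        "template": base + "/node/my-node-index/{key}/{value}",
        "provider": "lucene",
        "type": "fulltext",
    }
    query = base + "/my-node-index/query?query={query}"
    if version == 2:
        # Version 2: the query template lives in the provider-specific area.
        doc["provider-specific-resources"] = {"query-template": query}
    elif version == 3:
        # Version 3: the query template is promoted to the common area.
        doc["query-template"] = query
    return doc
```

A client that asked for the v2 type keeps getting the v2 layout even after
the server learns how to produce v3, which is the whole point.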

The big advantage I see is that you no longer have to get it perfect.  You
can add features in V1 in an ugly way and when you better understand the
problem domain you can fix it in V2.  This, along with URI discovery on the
client's part, looks like it could make for *very* resilient clients.  URIs
can change even in the same version and if the client is discovering URIs
starting at root, nothing breaks.  Return format of the json doc is
guaranteed so the client won't break there either.
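
The discovery part could look roughly like the Python sketch below. It is
a sketch under assumptions, not tested against a real server: FAKE_SERVER
stands in for HTTP GETs, the "node_index" key in the root document and the
helper names (fetch, lookup_uri) are assumed for illustration. The client
hard-codes only the root URI and reads everything else from the returned
documents.

```python
# Canned responses standing in for real HTTP GETs (hypothetical data).
FAKE_SERVER = {
    "http://localhost:7474/db/data/": {
        "node_index": "http://localhost:7474/db/data/index/node",
    },
    "http://localhost:7474/db/data/index/node": {
        "my-node-index": {
            "template": "http://localhost:7474/db/data/index/node/"
                        "my-node-index/{key}/{value}",
            "provider": "lucene",
        },
    },
}


def fetch(uri):
    """Stand-in for an HTTP GET that returns the parsed JSON document."""
    return FAKE_SERVER[uri]


def lookup_uri(root, index_name, key, value):
    """Discover the index lookup URI by following links from the root."""
    root_doc = fetch(root)
    indexes = fetch(root_doc["node_index"])
    template = indexes[index_name]["template"]
    # Fill in the URI template taken from the server's document.
    return template.replace("{key}", key).replace("{value}", value)
```

If the server later moves the index resources, only the documents change;
a client written this way keeps following whatever links it is given.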

Disclaimer:  I haven't tried this yet in real life.  I'm also not a REST
expert and might be missing something.  But I also have a lot of years of
development and analysis under my belt and this solution passes the initial
sniff test for me :-)  I hope it's worth considering on your side.

Todd


> Message: 10
> Date: Wed, 4 May 2011 21:11:09 +0200
> From: Jacob Hansson <ja...@voltvoodoo.com>
> Subject: Re: [Neo4j] Server Plugin Example to help with large queries
>        over REST API
> To: Neo4j user discussions <user@lists.neo4j.org>
> Message-ID: <BANLkTi=vckgcs1sfudlp2xxbvlwgkks...@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Tue, May 3, 2011 at 12:06 AM, Todd Chaffee <t...@mikamai.com> wrote:
>
> > Hi Jake,
> >
> > The short answer to "should we?" is right in the neo4j REST API
> > documentation: "The query syntax used here depends on what index provider
> > you chose when you created the index."
> >
>
> That's a really good point. You'd think I would have realized that our API
> is already provider-specific since I was the one who wrote that very
> statement in the docs. I think I was wearing my "inside the server" hat
> (where the implementation is provider-agnostic, we just pipe that query
> string through), instead of my "REST API user" hat while writing my
> previous
> email.
>
>
> >
> > Since the provider is "discoverable" via the REST API, the client app can
> > decide if it speaks that provider's query language or indicate an
> > "unsupported" error as the case may be.  No reason for the REST API to
> not
> > provide "discoverable" URIs or parameters to all the capabilities of each
> > provider.
> >
>
> If we decide to continue down the path of different APIs for different
> index-providers, then I absolutely agree. My concern is that by going that
> path, we push off the problem to the server clients, based on the assumption
> that they will be the ones adding indexes other than Lucene, and so they
> should know how to use them.
>
> I'm not sure that is necessarily true. There is currently work going on
> trying out alternatives to Lucene. If other index implementations emerge,
> it
> would be super-awesome if server clients could "just swap" providers and
> see
> speed improvements.
>
> That would mean providing a uniform way to do index queries.
>
> Perhaps a combination of both is possible - provide a fixed set of index
> functions that should be available from all providers, but also let
> providers extend the API to add extra features.
>
>
> >
> > The longer version:
> >
> > Concerning "We really don't want the API to be Lucene-specific (which is
> > why
> > we're currently only allowing that one query string as input)",
> >
> > I wonder if this is a case of not providing "what would work right now" just
> > for maintaining "future flexibility" for a requirement that doesn't yet
> > exist?
> >
> > REST paging and better access to the full Lucene API are open requests
> from
> > some of the community.  Are there any customers / community members
> > currently asking for a pluggable alternative to Lucene as of today?  It's
> > not a rhetorical question because maybe I'm just not aware of any such
> > requests.
> >
>
> "Opening up" the index API to reveal more of the underlying Lucene features
> would indeed allow us to do both sorting and paging on index searches.
>
> There are two things that still leave me hesitant to invest more time
> in solving this problem with this seemingly simple solution:
>
> One - There is currently work underway looking over our take on indexing,
> work that will hopefully lead to simplifying index usage a lot. The
> timeline
> for that is rather short, and it is likely that any changes done to the
> index API now will have to be redone within a short time frame.
>
> Two - I would much rather like a solution to paging and sorting that is the
> same for both traversals and indexes. The Lucene solution does solve some
> of
> the pain we've discussed in other threads, but not all of it. Rather than
> investing a bit of time to get a halfway-solution, I'd love to invest a
> little more time to get the full hullaballooza.
>
> It should also be noted that extending the Indexing API in a way where
> different implementations can have different APIs is not a *super* easy
> thing to do, it would take a bit of effort to do it well.
>
>
> >
> > The goal of providing a pluggable index implementation is worthy. Is it
> > maybe too early? Are index provider APIs  so standardized we can drop
> them
> > in as easily as we could for example switch database providers these
> days?
> >  Although the neo4j Index API is an abstraction of the Lucene API, the
> two
> > look tightly coupled to me as of today.  (E.g. neo4j IndexHits is Lucene
> > Hits).
> >
> > The approach of "hiding away" advanced functionality until some standard
> is
> > reached, or until we better understand the problem space, is not optimal.
> >  Not that there aren't good motivations.  We want to create a durable
> REST
> > API with URIs that don't change or break with each release.
> >
> > I have some ideas on how this could be done at the same time as providing
> > full functionality today.
> >
> > Mostly based around 1) using "vendor MIME media types" instead of the
> > generic Accept:application/json
>
>
> I'm not sure what you mean by this, could you elaborate on where we could
> use vendor MIME types, and for what purpose?
>
>
> > and 2) enhancing the REST API so it is fully
> > HATEOAS and "responses from the server will be documents that include
> URIs
> > to everything you can do next".
> >
> > Point 2 also becomes very interesting around paging (think "next" and
> > "previous" URIs as part of the returned document).
> >
> > I can provide more details and concrete examples if this approach sounds
> > interesting.
> >
> > Further reading:
> >
> > http://barelyenough.org/blog/2008/05/versioning-rest-web-services/
> >
> >
> >
> http://barelyenough.org/blog/2007/05/hypermedia-as-the-engine-of-application-state/
> >
> > Todd
> --


MIKAMAI | Making Media Social
http://mikamai.com
+447868260229
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
