Robert --
First off: great and relevant feedback. So thank you for that. Let's
see what we can address. Comments inline.
On Tue, Dec 2, 2008 at 8:41 PM, Emil Eifrem [EMAIL PROTECTED] wrote:
Hi all,
Robert Rees gave Neo4j a shot last week, trying to drive it from
Groovy but came away disappointed:
http://twitter.com/rrees/statuses/1022507656
I asked why:
http://twitter.com/emileifrem/status/1023049302
And here is Robert's excellent reply. I asked permission to repost on
the list to get everyone's input and kept Robert in cc since he's not
subscribed.
See mail below.
-EE
-- Forwarded message --
From: Robert Rees [EMAIL PROTECTED]
Date: Fri, Nov 28, 2008 at 4:04 PM
Subject: Starting pain with Neo4j
To: [EMAIL PROTECTED]
Hi Emil,
This is a follow up to the Tweets we've exchanged. I might also blog
about this at some point, for my own reference.
So what was the pain with getting started with Neo4j?
Well first I found I couldn't run it without the Shell and JTA jars. I
wouldn't mind but the documentation implied that if you just want to
embed a graph db and go you only need the main jar.
Hmm, yea that sucks.
So, the 'neo' component has a compile-time dependency on the 'shell'
component, but only a soft runtime dependency. I.e. it detects the
presence of the shell jar on the classpath when enableRemoteShell() is
invoked and only starts up the shell server if it's available. If not,
it'll print an error message. So shell shouldn't have caused you any
trouble, unless of course you wanted to use it, at which point of
course you need the jar.
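
For readers unfamiliar with the pattern, a soft runtime dependency of the kind described above can be sketched roughly as follows. This is a minimal illustration, not the actual Neo code: the class name SoftDependency, the helper names, and the example class org.example.shell.MissingShellServer are all hypothetical. Only Class.forName is a real JDK call.

```java
public class SoftDependency {

    /** Returns true if the named class can be loaded from the classpath. */
    public static boolean isAvailable(String className) {
        try {
            // initialize=false: only check for presence, do not run static initializers
            Class.forName(className, false, SoftDependency.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Hypothetical stand-in for a method like enableRemoteShell():
     * enable the optional feature only if its implementation class is present,
     * otherwise print an error and carry on.
     */
    public static boolean enableOptionalFeature(String implClassName) {
        if (!isAvailable(implClassName)) {
            System.err.println("Optional component not found on classpath: " + implClassName);
            return false;
        }
        // A real implementation would instantiate and start the component reflectively here.
        return true;
    }

    public static void main(String[] args) {
        // A class that is always present, then one that is assumed to be missing.
        System.out.println(enableOptionalFeature("java.util.ArrayList"));
        System.out.println(enableOptionalFeature("org.example.shell.MissingShellServer"));
    }
}
```

The point of the reflective lookup is that the calling code compiles against nothing from the optional jar at the call site, so a missing jar shows up as an error message rather than a NoClassDefFoundError at startup.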
We do have a 'hard' dependency on JTA. I've toyed with the idea of
replacing this dependency with a soft dep, but at the end of the day I
think it introduces more complexity than it's worth. It's probably
better to solve that through an assembly as discussed below.
Anyway, I fixed the line on neo4j.org that said "a single 500k jar."
(It now says that it's a single jar with one dependency.)
I think in terms of distributables if Neo4J requires these things
then it is worth having a complete jar for when you do want to just
get going and play around with it.
Yea, we should definitely make an aggregate assembly of some sort that
includes the most common artifacts like neo, shell, index-util and so
on.
Anders: you suggested this last week, did you create a ticket for it?
The EmbeddedNeo name also caused me a lot of confusion. I read that as
being in-memory in the style of Hsql or Derby. I didn't understand
that the path I was passing had to be a real directory rather than a
virtual or conceptual path. I then felt that if the process was going
to create a lot of files and be quite fussy as to whether it was
explicitly asked to shutdown then it would actually be easier to have
it run as a server process.
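
One common way to make an in-process store less fussy about explicit shutdown is to register a JVM shutdown hook. The sketch below only illustrates that idea under stated assumptions: the Store interface is a hypothetical stand-in for whatever handle the application holds, and nothing here is taken from the Neo API. Runtime.addShutdownHook is the only real library call.

```java
public class ShutdownHookSketch {

    /** Hypothetical stand-in for an in-process database handle. */
    public interface Store {
        void shutdown();
    }

    /**
     * Registers a JVM shutdown hook that calls store.shutdown() when the
     * process exits normally or is interrupted (e.g. Ctrl-C).
     * Returns the hook thread so the caller can remove it again.
     */
    public static Thread registerShutdownHook(final Store store) {
        Thread hook = new Thread(store::shutdown, "store-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Store store = () -> System.out.println("store shut down cleanly");
        registerShutdownHook(store);
        System.out.println("working with the store...");
        // On JVM exit the hook runs and calls store.shutdown().
    }
}
```

A hook like this does not help if the process is killed hard (kill -9, power loss), so it complements rather than replaces whatever recovery the store does on the next startup.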
I think this potential name confusion is something we'll have to live
with. Embedded really means in-process and hopefully most people
won't associate it with in-memory.
The Abstract Server classes and the whole
process felt like I had all the process of a big server architecture
with all the micromanagement involved but none of the power of being
able to connect and share multiple clients.
Hmm, I'm not sure exactly what you're referring to. I believe Mattias'
shell work has an AbstractServer (?), but that's an internal
implementation class. We're currently not a standalone server but are
doing work in that area (mainly via the RemoteNeo project).
(As a side note: I believe running standalone database servers is an
architectural bug, a boiling frog situation we've got stuck in because
of the 'best practices' and inertia of past paradigms when it was for
some reason deemed ok to expose your persistence layer to everyone and
their mom and use the database as the de-facto integration bus. That's
not the way to roll it in a world of services, IMHO, where you should
expose a domain abstraction rather than your underlying representation
on the wire. In that world, having an embedded database is the only
thing that makes sense. /rant
Having said that, I realize that a lot of people still expect a
standalone server and it IS very convenient in some situations. So
it's still something we should and will do. But that's the reason why
we didn't start out with that.)
The error messages you get when you fire up an Embedded datastore on
a directory that is either locked or non-existent didn't feel that
intuitive. The locked message says something like cannot create
neoidb or something similar rather than informing that the store was
already in use.
Hmm, the exception message is:
    throw new IllegalStateException( "Unable to lock store ["
        + storageFileName + "], this is usually a result of some "
        + "other Neo running using the same store." );
where storageFileName will be the file we weren't able to lock. (The
file name is printed for debugging purposes so we can figure