Re: [Neo] Starting pain with Neo4j

2008-12-03 Thread Peter Haensgen
Hi all,

since I don't agree with some of the comments from Robert I just want to
give my feedback as well.


  Well first I found I couldn't run it without the Shell and JTA jars. I
  wouldn't mind but the documentation implied that if you just want to
  embed a graph db and go you only need the main jar.

I think it is good to keep things separated. There is one jar for Neo
itself, and another jar for the JTA stuff. The JTA API has a different
lifecycle than Neo and is not under control of the Neo team, so it
should be kept separated. The shell you don't need for runtime, so there
are only 2 jars, which is absolutely acceptable.


  The EmbeddedNeo name also caused me a lot of confusion. I read that as
  being in-memory in the style of Hsql or Derby. I didn't understand
  that the path I was passing had to be a real directory rather than a
  virtual or conceptual path. I then felt that if the process was going
  to create a lot of files and be quite fussy as to whether it was
  explicitly asked to shutdown then it would actually be easier to have
  it run as a server process.

??
My perception at the first contact with Neo was that it is a database,
which is supposed to make things _persistent_. So for me it seemed quite
logical to specify some directory where to put the database files.
Embedded typically means that the lifetime of the database engine is
bound to the lifetime of my Java process, and this is what you get with
the EmbeddedNeo.

 
  Once everything was running I had two issues. The first was having to
  open a transaction just to read data seemed wrong. I see pure read
  data as being one of the most common tasks and if I am not going to be
  changing anything I don't see why I have to manage a transaction.

Agreed. There should be a shared (read-only) view on the data that does
not require a transaction, and isolated transactional views for
manipulating data.

 
  Secondly the property setting felt quite cumbersome, I would expect to
  be able to set multiple Properties via a Map<String, type> for
  example. I also think it should be part of the Core API to retrieve
  Nodes by Property although if I am reading the documentation correctly
  I think that might already be on your roadmap.

I have never missed this one. I think typically the Neo API will not be
used raw, but will be wrapped by a domain model, where the individual
member accessors will translate into node properties and relationships,
like:
class MyDomainClass {
    Node node;
    // The property key is the literal "name"; the argument is its value.
    void setName(String name) { node.setProperty("name", name); }
}
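
A slightly fuller sketch of the wrapping idea above, runnable without any
Neo jar. PropertyNode and MapNode are hypothetical stand-ins invented for
this illustration (they are not the Neo API), and Person is an example
domain class; only the pattern of typed accessors delegating to node
properties is taken from the message.

```java
import java.util.HashMap;
import java.util.Map;

public class DomainWrapperSketch {
    /** Hypothetical stand-in for a node's property API; NOT the real Neo interface. */
    public interface PropertyNode {
        void setProperty(String key, Object value);
        Object getProperty(String key);
    }

    /** In-memory stand-in so the sketch runs on its own. */
    public static class MapNode implements PropertyNode {
        private final Map<String, Object> props = new HashMap<>();
        @Override public void setProperty(String key, Object value) { props.put(key, value); }
        @Override public Object getProperty(String key) { return props.get(key); }
    }

    /** Example domain class: each typed accessor maps to one node property. */
    public static class Person {
        private final PropertyNode node;
        public Person(PropertyNode node) { this.node = node; }
        public void setName(String name) { node.setProperty("name", name); }
        public String getName() { return (String) node.getProperty("name"); }
    }

    public static void main(String[] args) {
        Person p = new Person(new MapNode());
        p.setName("Peter");
        System.out.println(p.getName());
    }
}
```

With this shape the application code never touches property keys
directly, which is why a bulk Map-based setter is rarely missed.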

 
  Again it might be just to do with the name but I would not expect to
  have to explicitly shutdown an Embedded process. The shutdown should
  be on the finalizer for the server. Once an Embedded object goes out
  of scope it is, to my mind, not in use any longer.

Java finalization is evil, as it does not have a predictable behaviour
and in extreme cases could even lead to out-of-memory situations (if the
Finalizer thread works slower than other threads that create new
objects). I would not rely on some finalize method to shut down the Neo
engine. Registering a shutdown hook with the VM is a better way. This
could be done by Neo by default (close the service if it is still
active), but there should still be a public shutdown method for
explicitly stopping Neo from the application. There are scenarios where
Neo must be closed while the VM still remains alive.
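
A minimal sketch of that combination, assuming a hypothetical Engine
class (not a Neo class): the constructor registers a JVM shutdown hook
via Runtime.addShutdownHook, and a public, idempotent shutdown() can
also be called explicitly while the VM stays alive. The releaseCount
field exists only to make the behaviour observable.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownHookSketch {
    /** Hypothetical embedded engine; stands in for whatever holds the store open. */
    public static class Engine {
        private final AtomicBoolean running = new AtomicBoolean(true);
        private final Thread hook = new Thread(this::shutdown);
        private int releaseCount = 0;

        public Engine() {
            // Safety net: close the engine if the application forgot to.
            Runtime.getRuntime().addShutdownHook(hook);
        }

        /** Explicit stop; safe to call more than once. */
        public void shutdown() {
            if (!running.compareAndSet(true, false)) {
                return; // already stopped
            }
            releaseCount++; // placeholder for flushing logs, releasing file locks, etc.
            try {
                // Unregister so a stopped engine is not kept alive by the hook.
                Runtime.getRuntime().removeShutdownHook(hook);
            } catch (IllegalStateException vmShuttingDown) {
                // We are being called from the hook itself; nothing to remove.
            }
        }

        public boolean isRunning() { return running.get(); }
        public int releaseCount() { return releaseCount; }
    }

    public static void main(String[] args) {
        Engine engine = new Engine();
        engine.shutdown(); // explicit stop; the VM keeps running
        System.out.println("running: " + engine.isRunning());
    }
}
```

Unlike a finalizer, the hook runs at a well-defined point (VM exit), and
the explicit path does not depend on garbage collection at all.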


Regards,
Peter

___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo] Starting pain with Neo4j

2008-12-02 Thread Emil Eifrem
Robert --

First off: great and relevant feedback. So thank you for that. Let's
see what we can address. Comments inline.

On Tue, Dec 2, 2008 at 8:41 PM, Emil Eifrem [EMAIL PROTECTED] wrote:
 Hi all,

 Robert Rees gave Neo4j a shot last week, trying to drive it from
 Groovy but came away disappointed:

   http://twitter.com/rrees/statuses/1022507656

 I asked why:

   http://twitter.com/emileifrem/status/1023049302

 And here is Robert's excellent reply. I asked permission to repost on
 the list to get everyone's input and kept Robert in cc since he's not
 subscribed.

 See mail below.

 -EE


 -- Forwarded message --
 From: Robert Rees [EMAIL PROTECTED]
 Date: Fri, Nov 28, 2008 at 4:04 PM
 Subject: Starting pain with Neo4j
 To: [EMAIL PROTECTED]


 Hi Emil,

 This is a follow up to the Tweets we've exchanged. I might also blog
 about this at some point, for my own reference.

 So what was the pain with getting started with Neo4j?

 Well first I found I couldn't run it without the Shell and JTA jars. I
 wouldn't mind but the documentation implied that if you just want to
 embed a graph db and go you only need the main jar.

Hmm, yea that sucks.

So, the 'neo' component has a compile-time dependency on the 'shell'
component, but only a soft runtime dependency. I.e. it detects the
presence of the shell jar on the classpath when enableRemoteShell() is
invoked and only starts up the shell server if it's available. If not,
it'll print an error message. So shell shouldn't have caused you any
trouble, unless of course you wanted to use it, at which point of
course you need the jar.
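
For readers unfamiliar with the pattern, here is a rough sketch of how a
soft runtime dependency can be detected in Java. This is not the actual
Neo code: the class name passed in main is a made-up placeholder, and
isOnClasspath / enableOptionalFeature are names invented for this
illustration. It only relies on the standard Class.forName call.

```java
public class SoftDependencySketch {
    /** Returns true if the named class can be loaded, without initializing it. */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, SoftDependencySketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false; // optional jar is missing or unusable
        }
    }

    /** Enables the feature only when its implementation class is present. */
    public static String enableOptionalFeature(String className) {
        if (!isOnClasspath(className)) {
            return "Feature unavailable: " + className + " not found on the classpath";
        }
        return "Feature enabled via " + className;
    }

    public static void main(String[] args) {
        // Hypothetical class name, used only to show the "missing jar" path.
        System.out.println(enableOptionalFeature("org.example.missing.ShellServer"));
    }
}
```

The caller still compiles against an interface it always has, and only
the lookup of the implementation is deferred to runtime.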

We do have a 'hard' dependency on JTA. I've toyed with the idea of
replacing this dependency with a soft dep, but at the end of the day I
think it introduces more complexity than it's worth. It's probably
better to solve that through an assembly as discussed below.

Anyway, I fixed the line on neo4j.org that said a single 500k jar.
(Now says that it's a single jar w/ one dependency.)

 I think in terms
 of distributables if Neo4J requires these things then it is worth
 having a complete jar for when you do want to just get going and
 play around with it.

Yea, we should definitely make an aggregate assembly of some sort that
includes the most common artifacts like neo, shell, index-util and so
on.

Anders: you suggested this last week, did you create a ticket for it?


 The EmbeddedNeo name also caused me a lot of confusion. I read that as
 being in-memory in the style of Hsql or Derby. I didn't understand
 that the path I was passing had to be a real directory rather than a
 virtual or conceptual path. I then felt that if the process was going
 to create a lot of files and be quite fussy as to whether it was
 explicitly asked to shutdown then it would actually be easier to have
 it run as a server process.

I think this potential name confusion is something we'll have to live
with. Embedded really means in-process and hopefully most people
won't associate it with in-memory.

 The Abstract Server classes and the whole
 process felt like I had all the process of a big server architecture
 with all the micromanagement involved but none of the power of being
 able to connect and share multiple clients.

Hmm, I'm not sure exactly what you're referring to. I believe Mattias'
shell work has an AbstractServer (?), but that's an internal
implementation class. We're currently not a standalone server but are
doing work in that area (mainly via the RemoteNeo project).

(As a side note: I believe running standalone database servers is an
architectural bug, a boiling frog situation we've got stuck in because
of the 'best practices' and inertia of past paradigms when it was for
some reason deemed ok to expose your persistence layer to everyone and
their mom and use the database as the de-facto integration bus. That's
not the way to roll it in a world of services, IMHO, where you should
expose a domain abstraction rather than your underlying representation
on the wire. In that world, having an embedded database is the only
thing that makes sense. /rant

Having said that, I realize that a lot of people still expect a
standalone server and it IS very convenient in some situations. So
it's still something we should and will do. But that's the reason why
we didn't start out with that.)


 The error messages you get when you fire up an Embedded datastore on
 a directory that is either locked or non-existent didn't feel that
 intuitive. The locked message says something like cannot create
 neoidb or something similar rather than informing that the store was
 already in use.

Hmm, the exception message is:

   throw new IllegalStateException( "Unable to lock store ["
   + storageFileName + "], this is usually a result of some "
   + "other Neo running using the same store." );

where storageFileName will be the file we weren't able to lock. (The
file name is printed for debugging purposes so we can figure 

Re: [Neo] Starting pain with Neo4j

2008-12-02 Thread Anders Nawroth
Emil Eifrem:
 Yea, we should definitely make an aggregate assembly of some sort that
 includes the most common artifacts like neo, shell, index-util and so
 on.

 Anders: you suggested this last week, did you create a ticket for it?
   

Now!

https://trac.neo4j.org/ticket/135

What about the name: neo-base? neo-common? ...

I'm not sure about redistributing JTA. See license:
https://olex.openlogic.com/licenses/108

The maven assembly plugin should be able to perform the aggregation.

When I know what should go into the aggregate, I'll set it up.

/anders



-- 
Anders Nawroth [EMAIL PROTECTED]
GTalk, Skype: anders.nawroth
Phone: +46 737 894 163
http://twitter.com/nawroth
http://blog.nawroth.com/
