[Neo4j] Start script of 1.4.M05/1.4.M06 fails with older versions of bash + FIX

2011-07-06 Thread Stephan Hagemann
Hi everyone,

a rather old linux installation on our build server led us to find out that
the new start script introduced in M05 (?) does not work with all versions
of bash.

We got:
cruise:/virtual/hudson/hudson_home/jobs/graphdb/workspace# /opt/neo4j/bin/neo4j start
/opt/neo4j/bin/neo4j: line 37: syntax error in conditional expression: unexpected token `('
/opt/neo4j/bin/neo4j: line 37: syntax error near `^(['
/opt/neo4j/bin/neo4j: line 37: `  if [[ ${line} =~ ^([^#\s][^=]+)=(.+)$ ]]; then'

On a system with this info:
ext-xecruise52-1:/opt/neo4j/bin# cat /proc/version
Linux version 2.6.28.7-ibm-x3650 (root@obc-fai42-1) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Thu Feb 26 13:50:31 CET 2009

Here is the quick fix I just found (no patch, since I don't want to suggest
I know it works on other system versions...).
Enclose the regexps on line 37ff in quotes, like so:

  if [[ ${line} =~ "^([^#\s][^=]+)=(.+)$" ]]; then
    key=`echo ${BASH_REMATCH[1]} | sed 's/\./_/g'`
    value=${BASH_REMATCH[2]}
    if [[ ${key} =~ "^(.*)_([0-9]+)$" ]]; then


Cheers,
Stephan
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Starting neo4j Server doesn't return to prompt

2011-04-18 Thread Stephan Hagemann
Hello group,

I just realized that since upgrading to Neo4j 1.3 my deployment is broken.
It seems to be due to the fact that when starting up, the server does not
return to a prompt (I noticed this locally also - I need to press enter to
get the prompt). Vlad (the deployment script) thus probably assumes that the
startup is not yet finished. I have played with the startup options in the
neo4j executable, but to no avail. Is anyone else experiencing this or has
some ideas?

Thanks!
Stephan


Re: [Neo4j] Any method to count nodes and relationships in a traversal framework

2011-04-13 Thread Stephan Hagemann
Hi Gunda,

I believe you are asking for the same thing I asked for a couple of days
ago. Check this thread:

http://lists.neo4j.org/pipermail/user/2011-April/007932.html

As the discussion shows, this feature is currently not available but
probably interesting in a lot of settings. At least for you and me ;)

Regards,
Stephan


On Wed, Apr 13, 2011 at 11:23, bhargav gunda bhargav@gmail.com wrote:

 Respected,

 I would like to know whether there is any method to count the number of
 nodes and relationships traversed by the traversal framework, similar to
 path.length(), which gives the depth of the tree. Is there a direct method
 like that?

 For example,

 suppose a tree consists of several nodes and several relationships, and I
 want to traverse from a particular node to the end of the tree, but the
 result should contain only a certain range of nodes: if the traversal
 covers 1000 nodes and 1 relationships from a particular node, I want only
 the result that contains node 35 to node 70.
 So is there any direct method to get the count of nodes and
 relationships?

 Thanks & Regards,
 G.


Re: [Neo4j] Number of nodes/relationships visited in query?

2011-04-11 Thread Stephan Hagemann
Hi Tobias,

yes!

 Since computation isn't performed until actually requested (when the
 iterator is iterated over), and since the Iterable could give a different
 result when you iterate over it subsequent times (due to the graph being
 modified), the Iterator object is the only object where I could see that
 such information could be added usefully. This does mean that you cannot
 use the java foreach loop with such an Iterable AND get the visited count,
 but would have to resort to using the hasNext() and next() methods. We
 could quite easily add some convenience methods for making that easier
 though, something like:

 CountIterator<Path> pathIter = shortestPath.findAllPaths( start, end ).iterator();
 for ( Path path : IterUtil.loop( pathIter ) ) {
     doSomethingWith(path);
 }
 // after the loop is done, number of nodes visited so far is the total
 // number of visited nodes.
 int visitedNodes = pathIter.numberOfNodesVisitedSoFar();


This is exactly what I thought I would like to do! It would be fine for
numberOfNodesVisitedSoFar to hold the number of nodes from the last run
of the iterator. I agree that it would be helpful to output it through the
REST API; that way you could compare the efficiency and load of queries (I
will send you another email about exactly that in a bit...).
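
A minimal sketch of the wrapper idea discussed above, in plain Java with no Neo4j dependency. CountingIterator, returnedSoFar() and loop() are hypothetical names made up for this illustration, not existing Neo4j API. Note the assumption: the wrapper can only count the elements it hands out (here, paths). Counting the nodes an algorithm visits internally would need a counter inside the algorithm itself.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Wraps any iterator and counts how many elements it has handed out so far. */
class CountingIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private int returnedSoFar = 0;

    CountingIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        T item = delegate.next();
        returnedSoFar++; // only counted once the element was actually produced
        return item;
    }

    int returnedSoFar() {
        return returnedSoFar;
    }

    /** Lets an existing iterator be used in a for-each loop (one pass only). */
    static <T> Iterable<T> loop(final Iterator<T> iterator) {
        return () -> iterator;
    }
}

class CountingIteratorDemo {
    public static void main(String[] args) {
        // Strings stand in for Path objects in this toy example.
        List<String> paths = Arrays.asList("a-b", "a-c-b", "a-d-b");
        CountingIterator<String> it = new CountingIterator<>(paths.iterator());
        for (String path : CountingIterator.loop(it)) {
            System.out.println(path);
        }
        System.out.println("returned so far: " + it.returnedSoFar());
    }
}
```

Because the iterator object is kept in a variable, the count stays available after the for-each loop, which is the point of the loop() helper.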


 WDYT? If this is useful we could add this kind of statistics to all types
 of traversals. Exposing that through the REST interface would be even
 simpler, the implementation would simply do the equivalent of the
 iteration above then add the statistics to the result. For paginated
 results (when those are added) we could have the statistics reflect the
 number of nodes visited for creating that page of data, for algorithms
 that find the easiest solutions first, you could use those statistics to
 stop a search when the number of nodes visited to collect a page grows
 too big.


I agree. Again, I believe, this would be genuinely helpful in a lot of
situations.

cu
Stephan


[Neo4j] Smarter expander needed?

2011-04-11 Thread Stephan Hagemann
Hi all,

the reason I asked the question about counting the number of visited nodes
earlier is that we are running into performance issues when working with
different expanders.

Our graph contains *user* and *company* nodes. There are a lot more users
than companies. Users are connected through *contact* relationships, users
are connected to companies as *employees*, companies aren't connected to
each other directly. For paths among users we only want to traverse contact
edges. For paths from users to companies we traverse contact edges and one
employee edge at the end (to get to a company).

We are using Neo's shortest path algorithm to find connections between users
and companies. The path requirements from above can be formalized into these
*expanders*:

Map<RelationshipType, Direction> userToCompanyRelations =
    new HashMap<RelationshipType, Direction>();
userToCompanyRelations.put(Relationship.contact, Direction.BOTH);
userToCompanyRelations.put(Relationship.employee, Direction.OUTGOING);
USER_TO_COMPANY = new DirectedExpander(userToCompanyRelations);

Map<RelationshipType, Direction> userToUserRelations =
    new HashMap<RelationshipType, Direction>();
userToUserRelations.put(Relationship.contact, Direction.BOTH);
USER_TO_USER = new DirectedExpander(userToUserRelations);

For the *query* we essentially do

GraphAlgoFactory.shortestPath(expander, 5).findAllPaths(fromNode, toNode);

As you can see from the attached screenshot: we are doing it wrong. The path
from a user to a company is almost 14 times slower than the path to another
user! And it gets worse when more types of edges are added. The result for
the user-company path is unacceptably slow.

What's going on?
If I could output the number of visited nodes for a query, I could tell you
exactly... Here is my intuition: the expanders only specify which edges can
be traversed. In the user-user path query this is ok: all the fully expanded
paths lead from one user to another user (so could technically be the path
we're looking for). In the user-company case most fully expanded paths will
lead from user to user also, for the query they will thus be unusable! We
could do better in the latter case if we had the possibility of specifying
in the expander that (a) there should be no attempt to expand after having
just passed over an employee edge (since now we are at a company and can't
find any more potential results on this path) and (b) there should be no
attempt to expand along a contact edge in step five (because that will only
lead to a user).

Did I just miss how I can put this information into the ShortestPath
algorithm or do I have to go one level deeper and implement a specific
ShortestPath algorithm for each of these queries? Does anyone else face this
problem? Anyone seeing similar kinds of queries?
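
To make rules (a) and (b) above concrete, here is a rough sketch in plain Java on a toy in-memory graph. Everything in it (the class, mayFollow, countExpansions, the toy data) is hypothetical and only illustrates the idea. It is not the ShortestPath implementation: it does an exhaustive depth-limited expansion without any uniqueness check and simply counts how many edges get followed, once with the type filter only and once with the two path-aware rules.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PathAwareExpansionSketch {
    enum Type { CONTACT, EMPLOYEE }

    static final class Edge {
        final String to;
        final Type type;
        Edge(String to, Type type) { this.to = to; this.type = type; }
    }

    // Adjacency list of a toy graph (hypothetical data, not a Neo4j store).
    final Map<String, List<Edge>> graph = new HashMap<>();

    void link(String from, String to, Type type) {
        graph.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, type));
        if (type == Type.CONTACT) {
            // contact edges go both ways, employee edges only user -> company
            graph.computeIfAbsent(to, k -> new ArrayList<>()).add(new Edge(from, type));
        }
    }

    /** Decides from the path so far (last edge type, depth) whether an edge may be followed. */
    static boolean mayFollow(Type last, int depth, int maxDepth, Type next, boolean pathAware) {
        if (!pathAware) return true;             // type filter only: follow every stored edge
        if (last == Type.EMPLOYEE) return false; // rule (a): we are at a company, stop here
        if (depth == maxDepth - 1 && next == Type.CONTACT) return false; // rule (b)
        return true;
    }

    /** Exhaustive depth-limited expansion; returns the number of edges followed. */
    int countExpansions(String node, Type last, int depth, int maxDepth, boolean pathAware) {
        if (depth == maxDepth) return 0;
        int followed = 0;
        for (Edge e : graph.getOrDefault(node, Collections.<Edge>emptyList())) {
            if (!mayFollow(last, depth, maxDepth, e.type, pathAware)) continue;
            followed += 1 + countExpansions(e.to, e.type, depth + 1, maxDepth, pathAware);
        }
        return followed;
    }

    /** Three users who all know each other; u3 works at company c1. */
    static PathAwareExpansionSketch toyGraph() {
        PathAwareExpansionSketch g = new PathAwareExpansionSketch();
        g.link("u1", "u2", Type.CONTACT);
        g.link("u1", "u3", Type.CONTACT);
        g.link("u2", "u3", Type.CONTACT);
        g.link("u3", "c1", Type.EMPLOYEE);
        return g;
    }

    public static void main(String[] args) {
        PathAwareExpansionSketch g = toyGraph();
        System.out.println("type filter only: " + g.countExpansions("u1", null, 0, 2, false));
        System.out.println("path-aware rules: " + g.countExpansions("u1", null, 0, 2, true));
    }
}
```

On this toy graph with a maximum depth of 2, the type-only variant follows 7 edges and the path-aware variant follows 3. In this small example the saving comes from rule (b); rule (a) has no effect here because the toy company has no outgoing edges.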

Thanks
Stephan
attachment: Screen shot 2011-04-11 at 15.41.57.png


Re: [Neo4j] Smarter expander needed?

2011-04-11 Thread Stephan Hagemann
Thanks for your ideas, Peter and Mattias!

We will work on them and hopefully have some results we can post back here
soon.


On Mon, Apr 11, 2011 at 21:18, Mattias Persson matt...@neotechnology.com wrote:

 2011/4/11 Peter Neubauer peter.neuba...@neotechnology.com:
  Mmh,
  you might be right in that the ShortestPath is not taking that much
  context info into account. In that case, I guess you should hack it to
  be even smarter about how to expand things. Right now, if you look at
  https://github.com/neo4j/graphdb/blob/master/graph-algo/src/main/java/org/neo4j/graphalgo/impl/path/ShortestPath.java#L267,
  the relationship expander is only getting a node as the context to
  decide what relationships to return.
 
  This probably could be changed to include a path as the context, or
  you fork ShortestPath and make it smarter for your case, or implement
  your own RelationshipExpander that is a bit smarter?

 Yes, I think two things would need to be done here (as you also noticed
 Peter):

 1) Have RelationshipExpander have its expand method accept a Path
 instead of a Node.
 2) Implement your own RelationshipExpander which can make smart
 decisions based on those Paths it gets as input.

 Having everything centered around Paths instead of Nodes/Relationships
 is good for just about everything, and I think all aspects of
 traversal will be geared towards that in a near future.

 To solve your problem now you might need a way to differentiate your
 user/company nodes so that you can immediately tell if a Node is a
 user or company, like a property or single relationship to a reference
 node or similar. If there is then you can still implement your own
 RelationshipExpander and have that look at each node and make decision
 based on that. If not then you might need to roll your own
 implementation, based on f.ex. ShortestPath.
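
 A small sketch of the "look at each node" suggestion above, written in plain Java against toy types. Node, Rel, NodeExpander and the kind field are stand-ins invented for this example, not the Neo4j interfaces. It assumes each node carries a marker that tells users and companies apart. With only the node as context the expander can refuse to expand from a company, but it cannot express depth-dependent rules; those would need the path as input.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NodeKindExpanderSketch {
    enum Kind { USER, COMPANY }
    enum RelType { CONTACT, EMPLOYEE }

    static final class Node {
        final String name;
        final Kind kind; // stands in for a property that marks user vs company
        final List<Rel> rels = new ArrayList<>();
        Node(String name, Kind kind) { this.name = name; this.kind = kind; }
    }

    static final class Rel {
        final RelType type;
        final Node other;
        Rel(RelType type, Node other) { this.type = type; this.other = other; }
    }

    /** Stores the relationship on both end nodes, so companies see their employees too. */
    static void connect(Node a, Node b, RelType type) {
        a.rels.add(new Rel(type, b));
        b.rels.add(new Rel(type, a));
    }

    /** Toy expander whose only context is the current node. */
    interface NodeExpander {
        List<Rel> expand(Node node);
    }

    /** Never expands from a company; from a user it offers contact and employee edges. */
    static final NodeExpander USER_TO_COMPANY = node -> {
        if (node.kind == Kind.COMPANY) {
            return Collections.<Rel>emptyList();
        }
        return new ArrayList<Rel>(node.rels);
    };

    /** Only ever offers contact edges. */
    static final NodeExpander USER_TO_USER = node -> {
        List<Rel> contacts = new ArrayList<>();
        for (Rel r : node.rels) {
            if (r.type == RelType.CONTACT) contacts.add(r);
        }
        return contacts;
    };

    public static void main(String[] args) {
        Node alice = new Node("alice", Kind.USER);
        Node bob = new Node("bob", Kind.USER);
        Node acme = new Node("acme", Kind.COMPANY);
        connect(alice, bob, RelType.CONTACT);
        connect(alice, acme, RelType.EMPLOYEE);

        System.out.println(USER_TO_COMPANY.expand(alice).size());
        System.out.println(USER_TO_COMPANY.expand(acme).size());
        System.out.println(USER_TO_USER.expand(alice).size());
    }
}
```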

 Any other suggestions, anyone?

 
  Cheers,
 
  /peter neubauer
 
  GTalk:  neubauer.peter
  Skype   peter.neubauer
  Phone   +46 704 106975
  LinkedIn   http://www.linkedin.com/in/neubauer
  Twitter  http://twitter.com/peterneubauer
 
  http://www.neo4j.org   - Your high performance graph database.
  http://startupbootcamp.org/- Öresund - Innovation happens HERE.
  http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
 
 
 

[Neo4j] Number of nodes/relationships visited in query?

2011-04-10 Thread Stephan Hagemann
Hi all,

is there a way for me to get to the number of nodes or relationships that
the graph algorithms (like ShortestPath) have visited and output it with the
result of the query? I would like to do that to analyze the performance and
load of query processing. Any ideas?

Thanks!

Stephan


Re: [Neo4j] Constructing an evaluator that only takes specific nodes from a path

2011-04-07 Thread Stephan Hagemann
Hi guys,

Dario and I are working together on this, so let me clarify, what we want to
achieve. An example query in a friend network would be:

Retrieve a set of people P that are the direct friends of person A. P should
include only those friends that are also on a path between A and another
user B.

We know how to find paths, but we fail at returning nodes - let alone sets
of nodes.

The old ReturnableEvaluator seemed to achieve just that: "A client hook for
evaluating whether a specific node should be returned from a traverser.",
but that is deprecated in the current milestone release. We're unable to
find the equivalent functionality with the new Traversal framework.
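
One way to get from paths to the node set described above, sketched in plain Java. It assumes the paths between A and B are already available as lists of node names (a stand-in for real Path objects), and FirstHopFriends and directFriendsOnPaths are made-up names. The set P is then the second node of each path. Treating the direct A-B path as contributing no friend is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FirstHopFriends {
    /** Collects the second node of every path, i.e. the direct neighbour of the start node. */
    static Set<String> directFriendsOnPaths(List<List<String>> pathsFromAToB) {
        Set<String> friends = new LinkedHashSet<>(); // a set, so each friend appears once
        for (List<String> path : pathsFromAToB) {
            // Assumption: a path with fewer than three nodes has no intermediate friend.
            if (path.size() >= 3) {
                friends.add(path.get(1));
            }
        }
        return friends;
    }

    public static void main(String[] args) {
        List<List<String>> paths = Arrays.asList(
                Arrays.asList("A", "P1", "B"),
                Arrays.asList("A", "P2", "X", "B"),
                Arrays.asList("A", "P1", "Y", "B"),
                Arrays.asList("A", "B"));
        System.out.println(directFriendsOnPaths(paths));
    }
}
```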

Thanks
Stephan



On Thu, Apr 7, 2011 at 09:35, Mattias Persson matt...@neotechnology.com wrote:

 Sorry, I meant

 INCLUDE_AND_PRUNE
    the path will be included in the result set, but the traversal
    won't go further down that path, but will continue down other paths
    that haven't been pruned



 --
 Mattias Persson, [matt...@neotechnology.com]
 Hacker, Neo Technology
 www.neotechnology.com


Re: [Neo4j] Constructing an evaluator that only takes specific nodes from a path

2011-04-07 Thread Stephan Hagemann
If they are indeed equivalent, Michael is right - then I was confused by the
doc talking about nodes vs the other talking about paths.


On Thu, Apr 7, 2011 at 10:43, Michael Hunger 
michael.hun...@neotechnology.com wrote:

 I think the confusing thing here is that ReturnableEvaluator talked about
 including/excluding nodes
 whereas when describing the Evaluations you spoke about including/excluding
 paths.

 Which of those is correct ?

 Cheers

 Michael

 On 07.04.2011 at 10:40, Mattias Persson wrote:

  ReturnableEvaluator is like controlling the INCLUDE/EXCLUDE part of an
  evaluation
  StopEvaluator is like controlling the CONTINUE/PRUNE part of an
 evaluation
 
  The @Deprecated TraversalDescription#prune and #filter are also a
  direct mapping of StopEvaluator and ReturnableEvaluator respectively.
  Evaluator replaces those and combines them into one concept where you
  can express the same semantics.
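
  The mapping described above can be pictured with a toy enum in plain Java. This is only an illustration, not the library class: apart from INCLUDE_AND_PRUNE, which is named in this thread, the constant names, the of() helper and evaluate() are assumptions of this sketch. One predicate plays the ReturnableEvaluator role (include or exclude), the other the StopEvaluator role (continue or prune), and the two answers are folded into a single value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class EvaluationSketch {
    /** Toy counterpart of a combined include/exclude and continue/prune decision. */
    enum Evaluation {
        INCLUDE_AND_CONTINUE(true, true),
        INCLUDE_AND_PRUNE(true, false),
        EXCLUDE_AND_CONTINUE(false, true),
        EXCLUDE_AND_PRUNE(false, false);

        final boolean includes;
        final boolean continues;

        Evaluation(boolean includes, boolean continues) {
            this.includes = includes;
            this.continues = continues;
        }

        static Evaluation of(boolean includes, boolean continues) {
            if (includes) {
                return continues ? INCLUDE_AND_CONTINUE : INCLUDE_AND_PRUNE;
            }
            return continues ? EXCLUDE_AND_CONTINUE : EXCLUDE_AND_PRUNE;
        }
    }

    /** Folds a "returnable" check and a "stop" check into one evaluation. */
    static Evaluation evaluate(List<String> path,
                               Predicate<List<String>> returnable,
                               Predicate<List<String>> stop) {
        return Evaluation.of(returnable.test(path), !stop.test(path));
    }

    public static void main(String[] args) {
        // Paths are lists of node names; a direct friend is a path with two nodes.
        Predicate<List<String>> directFriend = p -> p.size() == 2;
        Predicate<List<String>> stopAfterFirstHop = p -> p.size() >= 2;

        System.out.println(evaluate(Arrays.asList("A"), directFriend, stopAfterFirstHop));
        System.out.println(evaluate(Arrays.asList("A", "P"), directFriend, stopAfterFirstHop));
    }
}
```

  With these two predicates the start node alone evaluates to EXCLUDE_AND_CONTINUE and a one-hop path evaluates to INCLUDE_AND_PRUNE.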
 
 
 
 
 
 
  --
  Mattias Persson, [matt...@neotechnology.com]
  Hacker, Neo Technology
  www.neotechnology.com


Re: [Neo4j] Constructing an evaluator that only takes specific nodes from a path

2011-04-07 Thread Stephan Hagemann
Thanks, that will help! I will try defining my own uniqueness criteria.


  Oh, so if any node in the path (except the start node) has been returned
  in any other path before, then exclude it? That's the first
  time I've heard that requirement. Love the fact that you sent a
  picture, guys :)


Handcrafted - that's how we do it ;)

Our requirement is a little bit different than you state it:

if any node in the path _that will be returned_ has been returned in any
other path before, then exclude it.