Re: [Neo4j] Neo4jPHP

2011-07-07 Thread Eldad Yamin
Thanks!
I'll give it a try soon.

On Thu, Jul 7, 2011 at 1:50 AM, Josh Adell josh.ad...@gmail.com wrote:

 Hey all,

 I've been working on another PHP client for Neo4j.  I think it's ready
 for some real-life testing, and I'm interested to see what you all
 think.
 GitHub: https://github.com/jadell/Neo4jPHP
 Download: https://github.com/jadell/Neo4jPHP/tarball/0.0.1-beta

 Features:
 - Developed against the Neo4j 1.4 milestone releases
 - Simple, object-oriented API
 - Almost complete REST API coverage
 - Indexing of nodes and relationships, including exact match and query
 support
 - Cypher queries (thanks to Jacob Hansson)
 - Traversal support, including paged traversals
 - Lazy-loading of node and relationship data

 Hopefully coming soon:
 - Client-side caching
 - Batch operations

 There are some usage examples included.

 It's a beta release, so please be gentle (on me, that is; be as rough
 as you want with the code.)  If anyone finds any bugs or has feature
 requests, please use the GitHub issues page at
 https://github.com/jadell/Neo4jPHP/issues

 Thanks and enjoy!

 -- Josh Adell
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user



Re: [Neo4j] Neo4j Spatial - Keep OSM imports

2011-07-07 Thread Peter Neubauer
Robin,

The database is deleted after each run by the tearDown() method in
Neo4jTestCase.java:

    @Override
    @After
    protected void tearDown() throws Exception {
        shutdownDatabase(true);
        super.tearDown();
    }

If you change this to shutdownDatabase(false), the database will not be
deleted. In that case, make sure to run just that one test, so that
several tests do not write to the same DB:

    mvn test -Dtest=TestDynamicLayers

Does that work for you?
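
A minimal sketch of the idea behind the boolean flag, assuming it simply
decides whether the store directory is removed on teardown. TestDbCleanup and
the keepDb system property are hypothetical names used for illustration; they
are not part of Neo4j Spatial or Neo4jTestCase.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for the test base class's cleanup; not the real Neo4jTestCase.
final class TestDbCleanup {
    private TestDbCleanup() {}

    // Mirrors the idea of shutdownDatabase(boolean): delete the store directory only when asked to.
    static void shutdownDatabase(Path storeDir, boolean deleteDb) throws IOException {
        // A real test would shut the graph database down here before touching the files.
        if (!deleteDb || !Files.exists(storeDir)) {
            return;
        }
        List<Path> deepestFirst;
        try (Stream<Path> paths = Files.walk(storeDir)) {
            // Reverse path order puts children before their parent directories.
            deepestFirst = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : deepestFirst) {
            Files.delete(p);
        }
    }

    // Reads an assumed system property: with keepDb=true the store is preserved.
    static boolean deleteAfterTest() {
        return !Boolean.getBoolean("keepDb");
    }
}
```

With a helper like this, tearDown() could call
shutdownDatabase(dir, TestDbCleanup.deleteAfterTest()) instead of hard-coding
the flag, so the source would not need editing to keep a database.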


Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Tue, Jul 5, 2011 at 6:07 PM, Robin Cura robin.c...@gmail.com wrote:
 Hello,

 First of all, I don't know any Java, and I'm trying to figure out whether
 neo4j could be useful for my projects. If it is, I will of course learn
 enough Java to use neo4j properly for my needs.

 I'd like to use a neo4j spatial database together with GeoServer.
 For this, I'm following the tutorial here:
 http://wiki.neo4j.org/content/Neo4j_Spatial_in_GeoServer
 But this paragraph is blocking me:
 

   - One option for the database location is a database created using the
   unit tests in Neo4j Spatial. The rest of this wiki assumes that you ran the
   TestDynamicLayers unit test which loads an OSM dataset for the city of Malmö
   in Sweden, and then creates a number of Dynamic Layers (or views) on this
   data, which we can publish in GeoServer.
   - If you do use the unit test for the sample database, then the location
   of the database will be in the target/var/neo4j-db directory of the Neo4j
   Source code.

 

 My problem is that I can't manage to keep the neo4j spatial databases created
 by the tests: when I run TestDynamicLayers, it builds databases (in
 target/var/neo4j-db), but as soon as a database is successfully loaded, it
 deletes it and starts importing another database, and so on.

 My poor understanding of Java doesn't help much. I tried to edit the .java
 file in Netbeans + Maven, but so far it doesn't work: all the directories
 created during the tests are deleted when the test ends.

 Any idea how I could keep those databases?

 Thanks,

 Robin


Re: [Neo4j] Indexed relationships

2011-07-07 Thread Niels Hoogeveen

Finished the implementation of indexed relationships. The graph collections 
component now contains the package 
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
containing the IndexedRelationship class.
This class can be used instead of regular relationships when:
- relationships need to be stored in a particular sort order
- a unicity constraint needs to be guaranteed
- nodes become densely populated with relationships
The implementation is traverser friendly. Given a start node, all end nodes can 
be found by following four relationship types in the outgoing direction. Given an 
end node, the start node can be found by following these four relationship types 
in the incoming direction. Of course this functionality is also covered in the API.
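
To make the first two use cases concrete, here is a purely in-memory analogy,
assuming nothing about the actual IndexedRelationship API: end nodes are kept
in key order, and a unique index rejects a second entry for the same key.
SortedEndNodes and its methods are hypothetical names; the real class stores
its tree in the graph.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// In-memory analogy only; not the IndexedRelationship class from graph-collections.
final class SortedEndNodes<K, N> {
    private final TreeMap<K, List<N>> entries;
    private final boolean unique;

    SortedEndNodes(Comparator<? super K> order, boolean unique) {
        this.entries = new TreeMap<>(order);
        this.unique = unique;
    }

    // Adds an end node under a key; a unique index rejects a second node for the same key.
    void add(K key, N endNode) {
        List<N> bucket = entries.computeIfAbsent(key, k -> new ArrayList<>());
        if (unique && !bucket.isEmpty()) {
            throw new IllegalStateException("duplicate key: " + key);
        }
        bucket.add(endNode);
    }

    // Returns all end nodes in key order, regardless of insertion order.
    List<N> endNodes() {
        List<N> result = new ArrayList<>();
        for (List<N> bucket : entries.values()) {
            result.addAll(bucket);
        }
        return Collections.unmodifiableList(result);
    }
}
```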
Niels

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Thu, 7 Jul 2011 02:36:29 +0200
 Subject: Re: [Neo4j] Indexed relationships
 
 
 Pushed SortedTree to Git after adding a unit test and doing some debugging.
 TODO:
 - Add API for indexed relationships using SortedTree as the implementation.
 - Make SortedTree thread safe.
 With regard to the latter issue, I am considering the following solution: 
 acquire a lock (by deleting a non-existent property) on the node that points to 
 the root of the tree at the start of AddNode, RemoveNode and Delete. No other 
 node in the SortedTree is really stable: even the root node may be moved down, 
 turning another node into the new root node, and after a couple of remove 
 actions the original root node may even be deleted. 
 Locking the node pointing to the root node prevents all other 
 threads/transactions from updating the tree. This may seem restrictive, but a 
 single new entry or a single removal may in fact have an impact on much of the 
 tree, due to balancing. More selective locking would require a prebalancing 
 tree walk that determines the affected subtrees, locks them and, once every 
 affected subtree is locked, performs the actual balancing. 
 Please let me know if there are any objections to locking the node pointing 
 to the tree as a solution to make SortedTree thread safe.
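
The coarse-grained locking described above can be sketched outside the graph
as follows. This is an analogy under stated assumptions: a single
ReentrantLock plays the role of the node that points to the tree root, and a
java.util.TreeSet stands in for SortedTree. CoarseLockedTree is a hypothetical
name, and nothing here uses the Neo4j locking mechanism itself.

```java
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;

// Analogy: one lock guards the whole tree, so rebalancing never races with another writer.
final class CoarseLockedTree<T extends Comparable<T>> {
    private final ReentrantLock rootPointerLock = new ReentrantLock();
    private final TreeSet<T> tree = new TreeSet<>();

    // Every mutating operation takes the single lock first, as proposed for AddNode.
    boolean addNode(T value) {
        rootPointerLock.lock();
        try {
            return tree.add(value);
        } finally {
            rootPointerLock.unlock();
        }
    }

    // Same lock for removals, which may also restructure large parts of the tree.
    boolean removeNode(T value) {
        rootPointerLock.lock();
        try {
            return tree.remove(value);
        } finally {
            rootPointerLock.unlock();
        }
    }

    // Reads take the lock too, so they never observe a half-finished update.
    int size() {
        rootPointerLock.lock();
        try {
            return tree.size();
        } finally {
            rootPointerLock.unlock();
        }
    }
}
```

The trade-off is the one named in the message: writers are fully serialized,
in exchange for not having to work out in advance which subtrees a rebalance
will touch.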
 Niels
 
  Date: Tue, 5 Jul 2011 08:27:57 +0200
  From: neubauer.pe...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Indexed relationships
  
  Great work Niels!
  
  /peter
  
  Sent from my phone.
  On Jul 4, 2011 11:39 PM, Niels Hoogeveen pd_aficion...@hotmail.com
  wrote:
  
   Made some more changes to the SortedTree implementation. Previously
   SortedTree would throw an exception if a duplicate entry was being added.
   I changed SortedTree to allow a key to point to more than one node, unless
   the SortedTree is created as a unique index, in which case an exception is
   raised when an attempt is made to add a node to an existing key entry.
   A SortedTree, once defined as unique, cannot be changed to a non-unique
   index or vice versa.
   SortedTrees now have a name, which is stored in a property of the
   TREE_ROOT relationship and in the KEY_VALUE relationship (a new relationship
   that points from the SortedTree to the Node inserted in the SortedTree). The
   name of a SortedTree cannot be changed.
   SortedTrees now store the class of the Comparator, so a SortedTree, once
   created, cannot be used with a different Comparator.
   SortedTree is now an Iterable, making it possible to use it in a
   foreach-loop.
   Since there are as yet no unit tests for SortedTree, I will create
   those first before pushing my changes to Git. Preliminary results so far are
   good. I integrated the changes in my own application and it seems to work
   fine.
   Todo:
   - Decide on an API for indexed relationships. (Community input still welcome.)
   - Write unit tests.
   - Make SortedTree thread safe. (Community help still welcome.)
   Niels
  
   From: pd_aficion...@hotmail.com
   To: user@lists.neo4j.org
   Date: Mon, 4 Jul 2011 15:49:45 +0200
   Subject: Re: [Neo4j] Indexed relationships
  
  
   I forgot to add another recurrent issue that can be solved with indexed
  relationships: guaranteed unicity constraints.
From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Date: Mon, 4 Jul 2011 01:55:08 +0200
Subject: [Neo4j] Indexed relationships
   
   
   In the thread "[Neo4j] traversing densely populated nodes" we discussed
   the problems arising when large numbers of relationships are added to the
   same node.
   Over the weekend, I have worked on a solution for the
   dense-relationship-nodes using SortedTree in the neo-graph-collections
   component. After some minor tweaks to the implementation of SortedTree, I
   have managed to get a workable solution, where two nodes are not directly
   linked by a relationship, but by means of a BTree (entirely stored in the
   graph).
   Before continuing this work, I'd like to have a discussion about
   features, since what we have now is not just a solution for the densely
   populated node issue, but is actually a 

Re: [Neo4j] Indexed relationships

2011-07-07 Thread Michael Hunger
Good work,

do you have an example ready (and/or some tests that show how it works and is 
used)?

In particular for creation, manual traversal and automatic traversal (i.e. is 
there a RelationshipExpander that uses it?).

And in the constructor, if there is no relationship to the treeNode, you create 
a new one, but that new treeNode is not connected to the actual node?

I'm not sure if it should support the original relationship-traversal API / 
methods (getRelationships(Dir,type), etc).

Perhaps IndexedRelationship should rather be just a wrapper around a 
SuperNode? So probably rename it to SuperNode(Wrapper) or 
HeavilyConnectedNode(Wrapper)?

Cheers

Michael


Re: [Neo4j] REST batch support - transaction support for java rest client?

2011-07-07 Thread Patrik Sundberg
Following up on the topic of transactions for client API.

What is the current plan for some sort of client side API supporting
transactions?

I'm playing around with some ideas here and the lack of transaction support
in the client API is problematic. I know there's BATCH support in the REST
API which effectively is a transaction, but it doesn't always suit. For
example I have the following steps that I'd like to accomplish:
- create a reference node
- check if a node with a given domain id exists in an index; if it does, fail
- create an entity node for the given domain id
- add entity node to the index
- attach entity node to ref node
- create a node representing a specific version of the entity node
- attach the version node to the entity node, with some properties on the
relationships signifying valid time

That should all be considered an atomic operation, all or nothing. Doing it
step by step is very easy and natural with REST API, but trying to roll back
on error is flaky.
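
To illustrate the all-or-nothing behaviour being asked for, here is a toy
sketch that runs a list of steps against a private copy of the data and only
publishes the copy if every step succeeds. It assumes an in-memory map as a
stand-in for the graph and index; AtomicSteps and createEntity are
hypothetical names, and this is not how the REST API or the Java REST client
works.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy stand-in for a graph store: a map from indexed domain id to an entity name.
final class AtomicSteps {
    private Map<String, String> committed = new HashMap<>();

    // Runs every step against a private copy and publishes it only if all steps succeed.
    boolean runAll(List<Consumer<Map<String, String>>> steps) {
        Map<String, String> working = new HashMap<>(committed);
        try {
            for (Consumer<Map<String, String>> step : steps) {
                step.accept(working);
            }
        } catch (RuntimeException e) {
            return false; // the working copy is discarded, so nothing partial becomes visible
        }
        committed = working;
        return true;
    }

    // Returns a copy of the committed state for inspection.
    Map<String, String> snapshot() {
        return new HashMap<>(committed);
    }

    // Step that fails when the domain id is already indexed, like the index check in the list above.
    static Consumer<Map<String, String>> createEntity(String domainId) {
        return store -> {
            if (store.putIfAbsent(domainId, "entity:" + domainId) != null) {
                throw new IllegalStateException("already indexed: " + domainId);
            }
        };
    }
}
```

The point of the sketch is the shape of the client code: the steps are still
written one after another, but nothing becomes visible until the last one has
succeeded.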

I think I could batch it, but as a programming style it becomes pretty
unnatural. The same goes for a plugin that does the steps. The natural flow of
client-side code gets distorted by having to collect a lot of data upfront
and then provide all that data to a method call. It's doable, it just doesn't
seem ideal.

Using an embedded db and exposing it as some sort of service is also doable,
it's just that my domain is graph related and I'm pretty happy with just the
primitives and using a remote server (if I could have transactions).
There are quite a few clients, they need to share their data, and they don't all
run all the time, so I can't make the client API the embedded API.

I'd think it's not an uncommon situation, and that many people wish for
a natural client-side transaction API (similar to the embedded API).

Patrik


On Tue, Jul 5, 2011 at 12:27 PM, Patrik Sundberg
patrik.sundb...@gmail.com wrote:

 yeah, harder problem than my first hunch.

 sounds like plugins is the way to go for now, hopefully introduction of
 non-rest protocol with same interface as embedded API in 1.5 will simplify
 things in the future.

 thanks


 On Mon, Jul 4, 2011 at 11:07 PM, Michael Hunger 
 michael.hun...@neotechnology.com wrote:

 Patrik,

 I've already thought long and hard about that.

 The problem is you can't implement that transparently, as you can never
 allow code in a second call to rely on data derived from a previous one.

 The simplest form that I came up with is a BatchCommand that gets an API
 interface injected that allows requests but doesn't return data.

 The execution of this Batch command would then return a BatchResult with
 all the data acquired during the batch operation.

 Another way would be to inject the normal GraphDatabaseService interface,
 record the invocations in a first phase and then execute the batch command
 again (this time ignoring the inputs but then returning the results) but
 this is bad from a usability perspective.

 One critical issue is the creation of relationships as they depend on the
 correct node-ids of previously created nodes. Jacob already thought about
 some means of referring to previous output data but I think kept away from
 that as we didn't want to make this batch-interface a Turing-complete
 language.

 So you see, it's not that simple.

 Michael

 On 27.06.2011 at 20:45, Patrik Sundberg wrote:

  Hi,
 
  Since it is now possible to send off batches of operations via the REST
  interface, I was wondering if anyone has started to look at implementing
  transactions in the java REST client (
  https://github.com/jexp/neo4j-java-rest-binding) ?
 
  It would seem possible, but I can also see it could involve some major
  reorganizing of the internals of the client to make everything aware of
  transactions and submit via batch command.
 
  Patrik


Re: [Neo4j] REST batch support - transaction support for java rest client?

2011-07-07 Thread Michael Hunger
But then it would be possible to write a RequestFilter for the Neo4j-Server
that starts transactions and commits or rolls them back.

I.e. you create a tx object and put it in the session-context if there is none, 
and return a tx-token that the filter uses (e.g. as a header field).
Later you can pull it out again and attach it to the current thread 
(that's the tricky part).
On commit or rollback you just do that with the tx (after attaching it to the 
thread).

As the RestfulGraphDb and the Filter share the same execution thread, this 
could/should work.
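
A rough sketch of the token bookkeeping such a filter would need, assuming a
simple map from token to open transaction. TxRegistry and its Tx interface are
hypothetical names for illustration, not Neo4j Server classes, and the sketch
deliberately leaves out the tricky part of attaching the transaction to the
request thread.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical server-side registry; names are illustrative, not part of Neo4j Server.
final class TxRegistry {
    // Minimal transaction abstraction assumed for this sketch.
    interface Tx {
        void commit();
        void rollback();
    }

    private final Map<String, Tx> open = new ConcurrentHashMap<>();

    // Stores a newly started transaction and returns the token a client would send back, e.g. in a header.
    String register(Tx tx) {
        String token = UUID.randomUUID().toString();
        open.put(token, tx);
        return token;
    }

    // Finishes the transaction for a token; returns false for unknown or already finished tokens.
    boolean finish(String token, boolean commit) {
        Tx tx = open.remove(token);
        if (tx == null) {
            return false;
        }
        if (commit) {
            tx.commit();
        } else {
            tx.rollback();
        }
        return true;
    }

    // Number of transactions still held open on the server side.
    int openCount() {
        return open.size();
    }
}
```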

I wouldn't want to support that in the neo4j server by default as this creates 
a lot of server-side state that has to be managed.

But if it works out one could publish that as server-extension.

HTH

Michael


Re: [Neo4j] Performance issue on nodes with lots of relationships

2011-07-07 Thread Andrew White
I use the shell as-is, but the messages.log is reporting...

 Physical mem: 3962MB, Heap size: 881MB

My point is that if you ignore caching altogether, why did one run take 
17x longer with only 2.4x more data? Considering this is a rather 
iterative algorithm, I don't see why you would even read a node or 
relationship more than once and thus a cache shouldn't matter at all.

In this particular case, I can't imagine taking 9+ minutes to read a 
mere 3.4M nodes (that's only 6k nodes per sec). Perhaps this is just an 
artifact of Cypher, in which it builds a set of Rs before applying 
`count` rather than making count accept an iterable stream.
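
The figures can be checked directly from the shell output quoted below. Using
the reported counts and timings: 3472174 / 1468486 is about 2.36 times the
data; the second (warm) runs differ by 337975 / 19763, about 17.1 times, and
the first (cold) runs by 565448 / 25724, about 22 times; the cold run on the
larger node works out to roughly 6,140 relationships per second. A small
helper, with hypothetical names, that reproduces this arithmetic:

```java
// Helper for the arithmetic above; the inputs are the numbers from the quoted shell output.
final class TraversalStats {
    private TraversalStats() {}

    // How many times larger one measurement is than another.
    static double ratio(double larger, double smaller) {
        return larger / smaller;
    }

    // Throughput in items per second for a count processed in the given number of milliseconds.
    static double perSecond(long count, long millis) {
        return count * 1000.0 / millis;
    }
}
```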

Andrew

On 07/06/2011 11:33 PM, David Montag wrote:
 Hi Andrew,

 How big is your configured Java heap? It could be that all the nodes and
 relationships don't fit into the cache.

 David

 On Wed, Jul 6, 2011 at 8:03 PM, Andrew White li...@andrewewhite.net wrote:

 Here are some interesting stats to consider. First, I split my nodes into
 two groups, one node with 1.4M children and the other with 3.4M
 children. While I do see some cache warm-up improvements, the
 traversal doesn't seem to scale linearly; i.e. the larger super-node has
 2.4x more children but takes 17x longer to traverse.

 neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
 +----------+
 | count(r) |
 +----------+
 | 1468486  |
 +----------+
 1 rows, 25724 ms
 neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
 +----------+
 | count(r) |
 +----------+
 | 1468486  |
 +----------+
 1 rows, 19763 ms

 neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
 +----------+
 | count(r) |
 +----------+
 | 3472174  |
 +----------+
 1 rows, 565448 ms
 neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
 +----------+
 | count(r) |
 +----------+
 | 3472174  |
 +----------+
 1 rows, 337975 ms

 Any ideas on this?
 Andrew

 On 07/06/2011 09:55 AM, Peter Neubauer wrote:
 Andrew,
 if you upgrade to 1.4.M06, your shell should be able to do Cypher in
 order to count the relationships of a node, not returning them:

 start n=(1) match (n)-[r]-(x) return count(r)

 and try that several times to see if cold caches are initially slowing
 down things.

 or something along these lines. In the LS and Neoclipse the output and
 visualization will be slow for that amount of data.

 Cheers,

 /peter neubauer




 On Wed, Jul 6, 2011 at 4:15 PM, Andrew White li...@andrewewhite.net wrote:
 I have a graph with roughly 10M nodes. Some of these nodes are highly
 connected to other nodes. For example I may have a single node with 1M+
 relationships. A good analogy is a population that has a lives-in
 relationship to a state. Now the problem...

 Both neoclipse and neo4j-shell are terribly slow when working with these
 nodes. In the shell I would expect a `cd <node-id>` to be very fast,
 much like selecting via a rowid in a standard DB. Instead, I usually see
 a delay of several seconds. Doing an `ls` takes so long that I usually have to
 just kill the process. In fact `ls` never outputs anything, which is odd
 since I would expect it to stream the output as it finds it. I have
 very similar performance issues with neoclipse.

 I am using Neo4j 1.3 embedded on Ubuntu 10.04 with 4GB of RAM.
 Disclaimer, I am new to Neo4j.

 Thanks,
 Andrew


Re: [Neo4j] Indexed relationships

2011-07-07 Thread Niels Hoogeveen

Hi Michael,

I haven't yet worked on an example.

There are tests for the SortedTree implementation, but I didn't add any for
the IndexedRelationship class, which is simply a wrapper around SortedTree.
Having a test would have caught the error that no relationship to the
treeNode was created (I fixed that bug and pushed it to Git). Note to self:
always create a unit test, especially when code seems trivial.

There is no relationship expander that uses this. The RelationshipExpander
has a method Iterable<Relationship> expand(Node node), which cannot be
supported, since there is no direct relationship from start node to end node.
Instead there is a path through the index tree.

It's not possible to support the original relationship-traversal API, since
the IndexedRelationship class is not a wrapper around a node, but a wrapper
around the relationships of a certain RelationshipType in the OUTGOING
direction.

As to the name of the class: it is essentially an indexed relationship, and
not just a solution to the densely-connected-node problem. An indexed
relationship can also be used to maintain a sorted set of relationships of
any size, and can be used to guarantee unicity constraints.

Niels

Re: [Neo4j] REST batch support - transaction support for java rest client?

2011-07-07 Thread Patrik Sundberg
good idea. i'll ponder it for a bit.

but yes, we clearly need to keep state around, so for REST it'd be carried
around in session. but on server side I guess you have issues with never
ending transactions, how to cull them, etc. since it's a stateless
req/response comm channel. on a permanent channel it's easy to detect
disconnect and clean up, over http not as easy.
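
A minimal sketch of the token idea from this thread, including the culling of
abandoned transactions. It assumes a server-side registry keyed by a token that
the client sends back with each request. The names TxRegistry and OpenTx, the
idle-timeout policy and the string states are all hypothetical; nothing here is
Neo4j server code, and attaching a real transaction to the request thread is
left out entirely.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of open transactions, keyed by a client-held token. */
final class TxRegistry {
    /** Stand-in for a real transaction; it only tracks state and last use. */
    static final class OpenTx {
        volatile long lastUsedMillis;
        volatile String state = "OPEN";
        OpenTx(long now) { lastUsedMillis = now; }
    }

    private final Map<String, OpenTx> open = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;

    TxRegistry(long idleTimeoutMillis) { this.idleTimeoutMillis = idleTimeoutMillis; }

    /** Starts a transaction and returns the token to hand to the client. */
    String begin(long now) {
        String token = UUID.randomUUID().toString();
        open.put(token, new OpenTx(now));
        return token;
    }

    /** Finds the transaction for a follow-up request and resets its idle timer. */
    OpenTx touch(String token, long now) {
        OpenTx tx = open.get(token);
        if (tx == null) throw new IllegalStateException("unknown or expired tx token");
        tx.lastUsedMillis = now;
        return tx;
    }

    /** Commits or rolls back, then forgets the transaction. */
    void finish(String token, boolean commit) {
        OpenTx tx = open.remove(token);
        if (tx == null) throw new IllegalStateException("unknown or expired tx token");
        tx.state = commit ? "COMMITTED" : "ROLLED_BACK";
    }

    /** Rolls back every transaction idle for longer than the timeout. */
    int cullIdle(long now) {
        int culled = 0;
        for (Iterator<OpenTx> it = open.values().iterator(); it.hasNext(); ) {
            OpenTx tx = it.next();
            if (now - tx.lastUsedMillis > idleTimeoutMillis) {
                tx.state = "ROLLED_BACK";
                it.remove();
                culled++;
            }
        }
        return culled;
    }

    int openCount() { return open.size(); }
}
```

The clock is passed in as a parameter so the culling rule can be exercised
without sleeping; a real filter would presumably run cullIdle on a timer.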

thanks


On Thu, Jul 7, 2011 at 12:42 PM, Michael Hunger 
michael.hun...@neotechnology.com wrote:

 But then it would be possible to write a RequestFilter for the Neo4j-Server
 that does start and commit/rollbacks transactions.

 I.e. you create a tx object and put it in the session-context if there is
 none and return a tx-token that the filter uses (e.g. as header-field).
 then later you can pull it out again and attach it to the current thread
 (that's the tricky part).
 On commit or rollback you just do that with the tx (after attaching it to
 the thread).

 As the RestfulGraphDb and the Filter share the same execution thread this
 could/should work.

 I wouldn't want to support that in the neo4j server by default as this
 creates a lot of server-side state that has to be managed.

 But if it works out one could publish that as server-extension.

 HTH

 Michael

 Am 07.07.2011 um 13:30 schrieb Patrik Sundberg:

  Following up on the topic of transactions for client API.
 
  What is the current plan for some sort of client side API supporting
  transactions?
 
  I'm playing around with some ideas here and the lack of transaction
 support
  in the client API is problematic. I know there's BATCH support in the
 REST
  API which effectively is a transaction, but it doesn't always suit. For
  example I have the following steps that I'd like to accomplish:
  - create a reference node
  - check if a node with a given domain id exist in an index, if it does,
 fail
  - create an entity node for the given domain id
  - add entity node to the index
  - attach entity node to ref node
  - create a node representing a specific version of the entity node
  - attach the version node to the entity node, with some properties on the
  relationships signifying valid time
 
  That should all be considered an atomic operation, all or nothing. Doing
 it
  step by step is very easy and natural with REST API, but trying to roll
 back
  on error is flaky.
 
  I think could batch it, but from a programming style it becomes pretty
  unnatural. Same thing with a plugin for doing the steps. The natural flow
 of
  code client side gets distorted by having to collect a lot of data
 upfront
  and then provide all that data to a method call. It's doable, just
 doesn't
  seem ideal.
 
  Using an embedded db, exposing as some sort of service etc is also
 doable,
  it's just that my domain is graph related and I'm pretty happy with just
 the
  primitives and using a remote server (if I could have transactions).
  Number of clients are quite a few and need to share their data + don't
 all
  run all the time so can't make the client API the embedded api.
 
  I'd think it's not an uncommon situation and many people wishing for a
  support for natural client side transaction API (similar to embedded
 api).
 
  Patrik
 
 
  On Tue, Jul 5, 2011 at 12:27 PM, Patrik Sundberg
  patrik.sundb...@gmail.comwrote:
 
  yeah, harder problem than my first hunch.
 
  sounds like plugins is the way to go for now, hopefully introduction of
  non-rest protocol with same interface as embedded API in 1.5 will
 simplify
  things in the future.
 
  thanks
 
 
  On Mon, Jul 4, 2011 at 11:07 PM, Michael Hunger 
  michael.hun...@neotechnology.com wrote:
 
  Patrick,
 
  I've already thought long and hard about that.
 
  The problem is you can't implement that transparently as you can never
  allow code in a second call rely on data derived from a previous one.
 
  The simplest form that I came up with is a BatchCommand that gets an
 API
  interface injected that allows requests but doesn't return data.
 
  The execution of this Batch command would then return a BatchResult
 with
  all the data acquired during the batch operation.
 
  Another way would be to inject the normal GraphDatabaseService
 interface,
  record the invocations in a first phase and then execute the batch
 command
  again (this time ignoring the inputs but then returning the results)
 but
  this is bad from a usability perspective.
 
  One critical issue is the creation of relationships as they depend on
 the
  correct node-ids of previously created nodes. Jacob already thought
 about
  some means of referring to previous output data but I think kept away
 from
  that as we didn't want to make this batch-interface a turing complete
  language.
 
  So you see, it's not that simple.
 
  Michael
 
  Am 27.06.2011 um 20:45 schrieb Patrik Sundberg:
 
  Hi,
 
  Since there is now possible to send off batches of operations via the
  REST
  interface, I was wondering if anyone has started to look at
 implementing
  transactions in the 

Re: [Neo4j] Performance issue on nodes with lots of relationships

2011-07-07 Thread Agelos Pikoulas
I think it's the same problem pattern that has been in discussion lately with
dense nodes or supernodes (check
http://lists.neo4j.org/pipermail/user/2011-July/009832.html).

Michael Hunger has provided a quick solution to visiting the *few*
RelationshipTypes on a node that has *millions* of others, utilizing a
RelationshipExpander with an Index (check
http://paste.pocoo.org/show/traM5oY1ng7dRQAaf1oV/)

Ideally this would be abstracted & implemented in the core distribution so
that all APIs (including Cypher & tinkerpop Pipes/Gremlin) can use it
efficiently...
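
A minimal sketch of why a per-type index helps on a dense node, assuming an
in-memory stand-in rather than the Neo4j API. DenseNode, Rel, scanByType and
lookupByType are hypothetical names; the linked paste uses a real
RelationshipExpander and index, which this does not reproduce.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a node with very many relationships. */
final class DenseNode {
    static final class Rel {
        final String type;
        final long otherNodeId;
        Rel(String type, long otherNodeId) {
            this.type = type;
            this.otherNodeId = otherNodeId;
        }
    }

    private final List<Rel> all = new ArrayList<>();
    private final Map<String, List<Rel>> byType = new HashMap<>();

    /** Stores the relationship in the full list and in the per-type index. */
    void add(String type, long otherNodeId) {
        Rel rel = new Rel(type, otherNodeId);
        all.add(rel);
        byType.computeIfAbsent(type, k -> new ArrayList<>()).add(rel);
    }

    /** Naive expansion: walks every relationship, so cost grows with the total. */
    List<Rel> scanByType(String type) {
        List<Rel> out = new ArrayList<>();
        for (Rel rel : all) {
            if (rel.type.equals(type)) out.add(rel);
        }
        return out;
    }

    /** Indexed expansion: cost depends only on the number of matches. */
    List<Rel> lookupByType(String type) {
        return byType.getOrDefault(type, Collections.emptyList());
    }
}
```

Both methods return the same relationships; only the amount of work differs
when a handful of one type sit next to millions of another.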

Agelos

On Thu, Jul 7, 2011 at 3:16 PM, Andrew White li...@andrewewhite.net wrote:

 I use the shell as-is, but the messages.log is reporting...

 Physical mem: 3962MB, Heap size: 881MB

 My point is that if you ignore caching altogether, why did one run take
 17x longer with only 2.4x more data? Considering this is a rather
 iterative algorithm, I don't see why you would even read a node or
 relationship more than once and thus a cache shouldn't matter at all.

 In this particular case, I can't imagine taking 9+ minutes to read a
 mere 3.4M nodes (that's only 6k nodes per sec). Perhaps this is just an
 artifact of Cypher in which it is building a set of Rs before applying
 `count` rather than making count accept an iterable stream.

 Andrew

 On 07/06/2011 11:33 PM, David Montag wrote:
  Hi Andrew,
 
  How big is your configured Java heap? It could be that all the nodes and
  relationships don't fit into the cache.
 
  David
 
  On Wed, Jul 6, 2011 at 8:03 PM, Andrew Whiteli...@andrewewhite.net
  wrote:
 
  Here are some interesting stats to consider. First, I split my nodes into
  two groups, one node with 1.4M children and the other with 3.4M
  children. While I do see some cache warm-up improvements, the
  traversal doesn't seem to scale linearly; i.e. the larger super-node has
  2.4x more children but takes 17x longer to traverse.
 
  neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
  +----------+
  | count(r) |
  +----------+
  | 1468486  |
  +----------+
  1 rows, 25724 ms
  neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
  +----------+
  | count(r) |
  +----------+
  | 1468486  |
  +----------+
  1 rows, 19763 ms
 
  neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
  +----------+
  | count(r) |
  +----------+
  | 3472174  |
  +----------+
  1 rows, 565448 ms
  neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
  +----------+
  | count(r) |
  +----------+
  | 3472174  |
  +----------+
  1 rows, 337975 ms
 
  Any ideas on this?
  Andrew
 
  On 07/06/2011 09:55 AM, Peter Neubauer wrote:
  Andrew,
  if you upgrade to 1.4.M06, your shell should be able to do Cypher in
  order to count the relationships of a node, not returning them:
 
  start n=(1) match (n)-[r]-(x) return count(r)
 
  and try that several times to see if cold caches are initially slowing
  down things.
 
  or something along these lines. In the LS and Neoclipse the output and
  visualization will be slow for that amount of data.
 
  Cheers,
 
  /peter neubauer
 
  GTalk:  neubauer.peter
  Skype   peter.neubauer
  Phone   +46 704 106975
  LinkedIn   http://www.linkedin.com/in/neubauer
  Twitter  http://twitter.com/peterneubauer
 
  http://www.neo4j.org   - Your high performance graph
  database.
  http://startupbootcamp.org/- Öresund - Innovation happens HERE.
  http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
 party.
 
 
 
  On Wed, Jul 6, 2011 at 4:15 PM, Andrew Whiteli...@andrewewhite.net
wrote:
  I have a graph with roughly 10M nodes. Some of these nodes are highly
  connected to other nodes. For example I may have a single node with 1M+
  relationships. A good analogy is a population that has a lives-in
  relationship to a state. Now the problem...
 
  Both neoclipse and neo4j-shell are terribly slow when working with these
  nodes. In the shell I would expect a `cd <node-id>` to be very fast,
  much like selecting via a rowid in a standard DB. Instead, I usually see
  a delay of several seconds. Doing an `ls` takes so long that I usually have
  to just kill the process. In fact `ls` never outputs anything, which is odd
  since I would expect it to stream the output as it found it. I have
  very similar performance issues with neoclipse.
 
  I am using Neo4j 1.3 embedded on Ubuntu 10.04 with 4GB of RAM.
  Disclaimer, I am new to Neo4j.
 
  Thanks,
  Andrew

[Neo4j] Unique Constaint on Index

2011-07-07 Thread etc3
We are testing Neo4J and need to support unique emails across all users. Is
this possible with the current API? 

 



Re: [Neo4j] Unique Constaint on Index

2011-07-07 Thread Marko Rodriguez
Hi,

 We are testing Neo4J and need to support unique emails across all users. Is
 this possible with the current API? 

You can add such a constraint when updating the indices:

// Refuse the write if any element is already indexed under this address.
if (index.get('email', address).hasNext()) {
  throw new RuntimeException("There are two nodes that share the same email address.");
} else {
  index.put('email', address, node);
}

Marko.

http://markorodriguez.com


Re: [Neo4j] Unique Constaint on Index

2011-07-07 Thread etc3
How do I ensure another request is not performing the same operation on
another node in the cluster?


-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On
Behalf Of Marko Rodriguez
Sent: Thursday, July 07, 2011 10:35 AM
To: Neo4j user discussions
Subject: Re: [Neo4j] Unique Constaint on Index

Hi,

 We are testing Neo4J and need to support unique emails across all 
 users. Is this possible with the current API?

You can add such a constraint when updating the indices:

if (index.get('email', address).hasNext()) {
  throw new RuntimeException("There are two nodes that share the same email address.");
} else {
  index.put('email', address, node);
}

Marko.

http://markorodriguez.com


Re: [Neo4j] Indexed relationships

2011-07-07 Thread Niels Hoogeveen

Hi Michael,

I realize that the implementation of IndexedRelationship can in fact 
support returning relationships, and I have a preliminary version running 
locally now.

The returned relationships can support all methods of the 
Relationship interface, returning the node pointing to the treeRoot as the 
startNode, and returning the node set as the key_value as the endNode.

All relationship properties will be stored on the KEY_VALUE relationship 
pointing to the endNode.

There is one caveat to this solution: the returned relationships 
cannot support the getId() method, and will throw an 
UnsupportedOperationException when it is called.

IndexedRelationship will implement Iterable<Relationship>.

With these changes, it is possible to create 
an Expander and I am working right now to implement that.

Niels
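
A minimal sketch of the caveat described here, assuming a trimmed-down
stand-in for the relationship interface. Rel and VirtualRel are hypothetical
names and are not the org.neo4j.graphdb.Relationship interface or the
graph-collections classes; the point is only that a relationship realised as a
path through an index tree has no id of its own to return.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, trimmed-down stand-in for a relationship interface. */
interface Rel {
    long getId();
    long getStartNodeId();
    long getEndNodeId();
    Object getProperty(String key);
}

/** A relationship that exists only as a path through an index tree. */
final class VirtualRel implements Rel {
    private final long startNodeId;
    private final long endNodeId;
    // In the design described above these would live on the KEY_VALUE relationship.
    private final Map<String, Object> properties;

    VirtualRel(long startNodeId, long endNodeId, Map<String, Object> properties) {
        this.startNodeId = startNodeId;
        this.endNodeId = endNodeId;
        this.properties = new HashMap<>(properties);
    }

    /** There is no single stored relationship behind this object, hence no id. */
    @Override
    public long getId() {
        throw new UnsupportedOperationException("a virtual relationship has no id");
    }

    @Override
    public long getStartNodeId() { return startNodeId; }

    @Override
    public long getEndNodeId() { return endNodeId; }

    @Override
    public Object getProperty(String key) { return properties.get(key); }
}
```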

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Thu, 7 Jul 2011 14:46:35 +0200
 Subject: Re: [Neo4j] Indexed relationships
 
 
 Hi Michael,
 
 I haven't yet worked on an example. 
 
 There are tests for the SortedTree implementation, 
 but didn't add those to the IndexedRelationship class, 
 which is simply a wrapper around SortedTree. 
 Having a test would have caught the error 
 that no relationship to the treeNode was created 
 (fixed that bug and pushed it to Git) 
 (note to self: always create a unit test, 
 especially when code seems trivial).
 
 There is no relationship expander that uses this. 
 The RelationshipExpander has a method Iterable<Relationship> expand(Node 
 node) 
 which cannot be supported, since there is no direct relationship from 
 startnode to endnode. 
 Instead there is a path through the index tree. 
 
 It's not possible to support the original relationship-traversal API 
 since the IndexedRelationship class is not a wrapper around a node, 
 but a wrapper around the relationships of a certain RelationshipType in the 
 OUTGOING direction. 
 
 As to the name of the class. 
 It is essentially an indexed relationship, 
 and not just a solution to the densely-connected-node problem. 
 An indexed relationship can also be used to maintain 
 a sorted set of relationships of any size, 
 and can be used to guarantee unicity constraints. Niels
  From: michael.hun...@neotechnology.com
  Date: Thu, 7 Jul 2011 13:27:00 +0200
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Indexed relationships
  
  Good work,
  
  do you have an example ready (and/or some tests that show how it works/is 
  used) ?
  
  In creation, manual traversal and automatic traversal (i.e. is there a 
  RelationshipExpander that uses it).
  
  And in the constructor if there is no relationship to the treeNode, you 
  create a new one, but that new treeNode is not connected to the actual node?
  
  I'm not sure if it should support the original relationship-traversal API / 
  methods (getRelationships(Dir,type), etc).
  
  Perhaps that IndexedRelationship should rather be just a wrapper around a 
  SuperNode ? So probably rename it to SuperNode(Wrapper) or 
  HeavilyConnectedNode(Wrapper) ?)
  
  Cheers
  
  Michael
  
  Am 07.07.2011 um 12:51 schrieb Niels Hoogeveen:
  
   
   Finished the implementation of indexed relationships. The graph 
   collections component now contains the package 
   https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship,
containing the IndexedRelationship class.
   This class can be used instead of regular relationships when:
   - relationships need to be stored in a particular sort order
   - a unicity constraint needs to be guaranteed
   - nodes become densely populated with relationships.
   The implementation is traverser friendly. Given a start node, all end 
   nodes can be found by following four relationship types in the outgoing 
   direction. Given an end node, the start node can be found by following 
   these four relationship types in the incoming direction. Of course this 
   functionality is also covered in the API.
   Niels
   
   From: pd_aficion...@hotmail.com
   To: user@lists.neo4j.org
   Date: Thu, 7 Jul 2011 02:36:29 +0200
   Subject: Re: [Neo4j] Indexed relationships
   
   
   Pushed SortedTree to Git after adding a unit test and doing some 
   debugging.
   TODO:
   - Add API for indexed relationships using SortedTree as the implementation.
   - Make SortedTree thread safe.
   With regard to the latter issue, I am considering the following 
   solution: acquire a lock (by deleting a non-existent property) on the node 
   that points to the root of the tree at the start of AddNode, RemoveNode 
   and Delete. No other node in the SortedTree is really stable, even the 
   rootnode may be moved down, turning another node into the new rootnode, 
   while after a couple of remove actions the original rootnode may even be 
   deleted. 
   Locking the node pointing to the rootnode, prevents all other 
   threads/transactions from updating the tree. This may seem restrictive, 
   but a single new entry or a single removal may 

Re: [Neo4j] Unique Constaint on Index

2011-07-07 Thread Chris Gioran
Hi,

the ability to acquire locks cluster-wide exists, albeit in an ad hoc
fashion. Grabbing a write lock on the node you want to ensure is
uniquely indexed will ensure that the operations are serialized across
all cluster members.
The simplest way to get that lock currently is the (somewhat
hackish but entirely correct) removal of a non-existing property.
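
A minimal sketch of the serialize-then-check pattern, assuming an in-memory
stand-in. In Neo4j the lock would be taken as described above, by removing a
non-existing property from an agreed-upon node inside a transaction; here a
ReentrantLock plays that role and a HashMap plays the index. UniqueEmailIndex
and its methods are hypothetical names, and nothing in this sketch covers
cluster-wide locking.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical exact-match index whose check-then-put runs under one lock. */
final class UniqueEmailIndex {
    // Stands in for the write lock grabbed on a well-known node.
    private final ReentrantLock lock = new ReentrantLock();
    private final Map<String, Long> emailToNodeId = new HashMap<>();

    /** Registers the address for the node, or throws if it is already taken. */
    void putUnique(String address, long nodeId) {
        lock.lock();
        try {
            // The check and the write cannot interleave with another caller.
            if (emailToNodeId.containsKey(address)) {
                throw new IllegalStateException("email already in use: " + address);
            }
            emailToNodeId.put(address, nodeId);
        } finally {
            lock.unlock();
        }
    }

    /** Returns the node id registered for the address, or null if none. */
    Long lookup(String address) {
        lock.lock();
        try {
            return emailToNodeId.get(address);
        } finally {
            lock.unlock();
        }
    }

    /** Lets several threads race for one address; returns how many succeeded. */
    static int race(UniqueEmailIndex index, String address, int threads) {
        AtomicInteger winners = new AtomicInteger();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final long nodeId = i;
            pool[i] = new Thread(() -> {
                try {
                    index.putUnique(address, nodeId);
                    winners.incrementAndGet();
                } catch (IllegalStateException duplicate) {
                    // Lost the race; another thread registered the address first.
                }
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return winners.get();
    }
}
```

Without the lock, two callers could both pass the check before either writes,
which is the situation the question about other requests is worried about.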

cheers,
CG

On Thu, Jul 7, 2011 at 5:53 PM, etc3 e...@nextideapartners.com wrote:
 How do I ensure another request is not performing the same operation on
 another node in the cluster?


 -Original Message-
 From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On
 Behalf Of Marko Rodriguez
 Sent: Thursday, July 07, 2011 10:35 AM
 To: Neo4j user discussions
 Subject: Re: [Neo4j] Unique Constaint on Index

 Hi,

 We are testing Neo4J and need to support unique emails across all
 users. Is this possible with the current API?

 You can add such a constraint when updating the indices:

 if (index.get('email', address).hasNext()) {
   throw new RuntimeException("There are two nodes that share the same email address.");
 } else {
   index.put('email', address, node);
 }

 Marko.

 http://markorodriguez.com


Re: [Neo4j] Unique Constaint on Index

2011-07-07 Thread Niels Hoogeveen

Marko's solution works, because you roll back the transaction once you find a 
duplicate entry.
Another solution to this problem is to use the SortedTree index in 
graph-collections https://github.com/peterneubauer/graph-collections, which has 
a setting that makes an index unique. This component is relatively new and 
could use some proper testing, though.
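
A minimal sketch of what a uniqueness setting on a sorted index amounts to,
assuming a plain TreeMap as the backing structure. SortedIndex and its methods
are hypothetical names and are not the SortedTree API from graph-collections,
which stores its entries as nodes and relationships in the graph.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sorted index with an optional uniqueness setting. */
final class SortedIndex<K extends Comparable<K>, V> {
    private final boolean unique;
    private final TreeMap<K, List<V>> entries = new TreeMap<>();

    SortedIndex(boolean unique) { this.unique = unique; }

    /** Adds a value under the key; a unique index rejects an existing key. */
    void add(K key, V value) {
        List<V> bucket = entries.get(key);
        if (bucket == null) {
            bucket = new ArrayList<>();
            entries.put(key, bucket);
        } else if (unique) {
            throw new IllegalStateException("duplicate key: " + key);
        }
        bucket.add(value);
    }

    /** Returns every value stored under the key, or an empty list. */
    List<V> get(K key) {
        List<V> bucket = entries.get(key);
        if (bucket == null) return Collections.<V>emptyList();
        return Collections.unmodifiableList(bucket);
    }

    /** Returns the keys in their natural sort order. */
    List<K> keysInOrder() { return new ArrayList<>(entries.keySet()); }
}
```

As with the index check, the rejection only protects the data if the
surrounding transaction is rolled back when the exception is thrown.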
Niels
 From: e...@nextideapartners.com
 To: user@lists.neo4j.org
 Date: Thu, 7 Jul 2011 10:53:20 -0400
 Subject: Re: [Neo4j] Unique Constaint on Index
 
 How do I ensure another request is not performing the same operation on
 another node in the cluster?
 
 
 -Original Message-
 From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On
 Behalf Of Marko Rodriguez
 Sent: Thursday, July 07, 2011 10:35 AM
 To: Neo4j user discussions
 Subject: Re: [Neo4j] Unique Constaint on Index
 
 Hi,
 
  We are testing Neo4J and need to support unique emails across all 
  users. Is this possible with the current API?
 
 You can add such a constraint when updating the indices:
 
 if (index.get('email', address).hasNext()) {
   throw new RuntimeException("There are two nodes that share the same email address.");
 } else {
   index.put('email', address, node);
 }
 
 Marko.
 
 http://markorodriguez.com


Re: [Neo4j] Unique Constaint on Index

2011-07-07 Thread Marko Rodriguez
You can use Transactions.

Marko.

 How do I ensure another request is not performing the same operation on
 another node in the cluster?
 
 
 -Original Message-
 From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On
 Behalf Of Marko Rodriguez
 Sent: Thursday, July 07, 2011 10:35 AM
 To: Neo4j user discussions
 Subject: Re: [Neo4j] Unique Constaint on Index
 
 Hi,
 
 We are testing Neo4J and need to support unique emails across all 
 users. Is this possible with the current API?
 
 You can add such a constraint when updating the indices:
 
 if (index.get('email', address).hasNext()) {
   throw new RuntimeException("There are two nodes that share the same email address.");
 } else {
   index.put('email', address, node);
 }
 
 Marko.
 
 http://markorodriguez.com


[Neo4j] typo in Expander interface

2011-07-07 Thread Niels Hoogeveen

The interface of org.neo4j.graphdb.Expander contains a typo.
The method addRelationsipFilter(Predicate<? super Relationship>) should be 
called addRelationshipFilter(Predicate<? super Relationship>).
Niels 


Re: [Neo4j] typo in Expander interface

2011-07-07 Thread Adriano Henrique de Almeida
I sent a Pull Request a long time ago fixing this, but it was to the old
neo4j repository. Guess it wasn't merged.

https://github.com/neo4j/graphdb/pull/2

Can send it again, if the guys want.

2011/7/7 Niels Hoogeveen pd_aficion...@hotmail.com


 The interface of org.neo4j.graphdb.Expander contains a typo.
 The method addRelationsipFilter(Predicate<? super Relationship>) should be
 called addRelationshipFilter(Predicate<? super Relationship>).
 Niels




-- 
Adriano Almeida
Caelum | Ensino e Inovação
www.caelum.com.br


Re: [Neo4j] typo in Expander interface

2011-07-07 Thread Peter Neubauer
Yes,
please do, and send the CLA mail first, see
http://wiki.neo4j.org/content/About_Contributor_License_Agreement

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Thu, Jul 7, 2011 at 5:14 PM, Adriano Henrique de Almeida
adrianoalmei...@gmail.com wrote:
 I've sent a Pull Request long time ago fixing this, but it was to the old
 neo4j repository. Guess it wasn't merged.

 https://github.com/neo4j/graphdb/pull/2

 Can send it again, if the guys want.

 2011/7/7 Niels Hoogeveen pd_aficion...@hotmail.com


 The interface of org.neo4j.graphdb.Expander contains a typo.
 The method addRelationsipFilter(Predicate<? super Relationship>) should be
 called addRelationshipFilter(Predicate<? super Relationship>).
 Niels




 --
 Adriano Almeida
 Caelum | Ensino e Inovação
 www.caelum.com.br


Re: [Neo4j] Unique Constaint on Index

2011-07-07 Thread etc3
I'm new to this framework, is there an example that demonstrates removing a
non-existent property and how it would be used in this context?

Thanks

-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On
Behalf Of Chris Gioran
Sent: Thursday, July 07, 2011 11:04 AM
To: Neo4j user discussions
Subject: Re: [Neo4j] Unique Constaint on Index

Hi,

the ability to acquire locks cluster-wide exists, albeit in an ad hoc
fashion. Grabbing a write lock on the node you want to ensure is uniquely
indexed will ensure that the operations are serialized across all cluster
members.
The most simple way to get that lock currently is the (somewhat hackish but
entirely correct) removal of a non-existing property.

cheers,
CG

On Thu, Jul 7, 2011 at 5:53 PM, etc3 e...@nextideapartners.com wrote:
 How do I ensure another request is not performing the same operation 
 on another node in the cluster?


 -Original Message-
 From: user-boun...@lists.neo4j.org 
 [mailto:user-boun...@lists.neo4j.org] On Behalf Of Marko Rodriguez
 Sent: Thursday, July 07, 2011 10:35 AM
 To: Neo4j user discussions
 Subject: Re: [Neo4j] Unique Constaint on Index

 Hi,

 We are testing Neo4J and need to support unique emails across all 
 users. Is this possible with the current API?

 You can add such a constraint when updating the indices:

 if (index.get('email', address).hasNext()) {
   throw new RuntimeException("There are two nodes that share the same email address.");
 } else {
   index.put('email', address, node);
 }

 Marko.

 http://markorodriguez.com


Re: [Neo4j] Unique Constaint on Index

2011-07-07 Thread Aseem Kishore
I'll strongly +1 that having a concept of unique index values should be
built into Neo4j. It's just too common of a requirement.

Aseem

On Thu, Jul 7, 2011 at 11:48 AM, etc3 e...@nextideapartners.com wrote:

 I'm new to this framework, is there an example that demonstrates removing a
 non-existent property and how it would be used in this context?

 Thanks

 -Original Message-
 From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org]
 On
 Behalf Of Chris Gioran
 Sent: Thursday, July 07, 2011 11:04 AM
 To: Neo4j user discussions
 Subject: Re: [Neo4j] Unique Constaint on Index

 Hi,

 the ability to acquire locks cluster-wide exists, albeit in an ad hoc
 fashion. Grabbing a write lock on the node you want to ensure is uniquely
 indexed will ensure that the operations are serialized across all cluster
 members.
 The most simple way to get that lock currently is the (somewhat hackish but
 entirely correct) removal of a non-existing property.

 cheers,
 CG

 On Thu, Jul 7, 2011 at 5:53 PM, etc3 e...@nextideapartners.com wrote:
  How do I ensure another request is not performing the same operation
  on another node in the cluster?
 
 
  -Original Message-
  From: user-boun...@lists.neo4j.org
  [mailto:user-boun...@lists.neo4j.org] On Behalf Of Marko Rodriguez
  Sent: Thursday, July 07, 2011 10:35 AM
  To: Neo4j user discussions
  Subject: Re: [Neo4j] Unique Constaint on Index
 
  Hi,
 
  We are testing Neo4J and need to support unique emails across all
  users. Is this possible with the current API?
 
  You can add such a constraint when updating the indices:
 
  if (index.get('email', address).hasNext()) {
    throw new RuntimeException("There are two nodes that share the same email address.");
  } else {
    index.put('email', address, node);
  }
 
  Marko.
 
  http://markorodriguez.com


Re: [Neo4j] typo in Expander interface

2011-07-07 Thread Mattias Persson
I've done some traversal additions, and this also in a branch... pushing
soon!

2011/7/7 Adriano Henrique de Almeida adrianoalmei...@gmail.com

 I've sent a Pull Request long time ago fixing this, but it was to the old
 neo4j repository. Guess it wasn't merged.

 https://github.com/neo4j/graphdb/pull/2

 Can send it again, if the guys want.

 2011/7/7 Niels Hoogeveen pd_aficion...@hotmail.com

 
  The interface of org.neo4j.graphdb.Expander contains a typo.
  The method addRelationsipFilter(Predicate<? super Relationship>) should be
  called addRelationshipFilter(Predicate<? super Relationship>).
  Niels
 



 --
 Adriano Almeida
 Caelum | Ensino e Inovação
 www.caelum.com.br




-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Performance issue on nodes with lots of relationships

2011-07-07 Thread Mattias Persson
2011/7/7 Agelos Pikoulas agelos.pikou...@gmail.com

 I think it's the same problem pattern that has been in discussion lately with
 dense nodes or supernodes (check
 http://lists.neo4j.org/pipermail/user/2011-July/009832.html).

 Michael Hunger has provided a quick solution to visiting the *few*
 RelationshipTypes on a node that has *millions* of others, utilizing a
 RelationshipExpander with an Index (check
 http://paste.pocoo.org/show/traM5oY1ng7dRQAaf1oV/)

 Ideally this would be abstracted & implemented in the core distribution so
 that all APIs (including Cypher & tinkerpop Pipes/Gremlin) can use it
 efficiently...


Yes, I'm positive that something will be done on a core level to make
getting relationships of a specific type regardless of the total number of
relationships fast. In the foreseeable future hopefully.


 Agelos

 On Thu, Jul 7, 2011 at 3:16 PM, Andrew White li...@andrewewhite.net
 wrote:

  I use the shell as-is, but the messages.log is reporting...
 
  Physical mem: 3962MB, Heap size: 881MB
 
  My point is that if you ignore caching altogether, why did one run take
  17x longer with only 2.4x more data? Considering this is a rather
  iterative algorithm, I don't see why you would even read a node or
  relationship more than once and thus a cache shouldn't matter at all.
 
  In this particular case, I can't imagine taking 9+ minutes to read a
  mere 3.4M nodes (that's only 6k nodes per sec). Perhaps this is just an
  artifact of Cypher in which it is building a set of Rs before applying
  `count` rather than making count accept an iterable stream.
 
  Andrew
 
  On 07/06/2011 11:33 PM, David Montag wrote:
   Hi Andrew,
  
   How big is your configured Java heap? It could be that all the nodes
 and
   relationships don't fit into the cache.
  
   David
  
   On Wed, Jul 6, 2011 at 8:03 PM, Andrew Whiteli...@andrewewhite.net
   wrote:
  
   Here are some interesting stats to consider. First, I split my nodes into
   two groups, one node with 1.4M children and the other with 3.4M
   children. While I do see some cache warm-up improvements, the
   traversal doesn't seem to scale linearly; i.e. the larger super-node has
   2.4x more children but takes 17x longer to traverse.
  
   neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
   +----------+
   | count(r) |
   +----------+
   | 1468486  |
   +----------+
   1 rows, 25724 ms
   neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
   +----------+
   | count(r) |
   +----------+
   | 1468486  |
   +----------+
   1 rows, 19763 ms
  
   neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
   +----------+
   | count(r) |
   +----------+
   | 3472174  |
   +----------+
   1 rows, 565448 ms
   neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
   +----------+
   | count(r) |
   +----------+
   | 3472174  |
   +----------+
   1 rows, 337975 ms
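
   The ratios quoted in this message can be re-derived from the four shell
   timings. A small sketch (class and method names are hypothetical; the
   inputs are the counts and milliseconds printed by the shell):

```java
public class ScalingCheck {
    static double ratio(double a, double b) {
        return a / b;
    }

    // Relationships counted per second for a run measured in milliseconds.
    static double perSecond(long count, long millis) {
        return count / (millis / 1000.0);
    }

    public static void main(String[] args) {
        long small = 1_468_486L;   // count(r) for node 1
        long large = 3_472_174L;   // count(r) for node 2
        System.out.printf("data ratio:            %.2fx%n", ratio(large, small));
        System.out.printf("time ratio, first run:  %.2fx%n", ratio(565_448, 25_724));
        System.out.printf("time ratio, second run: %.2fx%n", ratio(337_975, 19_763));
        System.out.printf("node 1, first run:      %.0f rels/s%n", perSecond(small, 25_724));
        System.out.printf("node 2, first run:      %.0f rels/s%n", perSecond(large, 565_448));
    }
}
```

   The data ratio comes out at roughly 2.4x. The 17x figure matches the second
   (warmer) runs; the first runs differ by roughly 22x, and throughput falls
   from about 57k to about 6k relationships per second.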
  
   Any ideas on this?
   Andrew
  
   On 07/06/2011 09:55 AM, Peter Neubauer wrote:
   Andrew,
   if you upgrade to 1.4.M06, your shell should be able to do Cypher in
   order to count the relationships of a node, not returning them:
  
   start n=(1) match (n)-[r]-(x) return count(r)
  
   and try that several times to see if cold caches are initially
 slowing
   down things.
  
   or something along these lines. In the LS and Neoclipse the output
 and
   visualization will be slow for that amount of data.
  
   Cheers,
  
   /peter neubauer
  
   GTalk:  neubauer.peter
   Skype   peter.neubauer
   Phone   +46 704 106975
   LinkedIn   http://www.linkedin.com/in/neubauer
   Twitter  http://twitter.com/peterneubauer
  
   http://www.neo4j.org   - Your high performance graph
   database.
   http://startupbootcamp.org/- Öresund - Innovation happens HERE.
   http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
  party.
  
  
  
   On Wed, Jul 6, 2011 at 4:15 PM, Andrew White li...@andrewewhite.net wrote:
   I have a graph with roughly 10M nodes. Some of these nodes are highly
   connected to other nodes. For example I may have a single node with 1M+
   relationships. A good analogy is a population that has a "lives-in"
   relationship to a state. Now the problem...
  
   Both neoclipse and neo4j-shell are terribly slow when working with these
   nodes. In the shell I would expect a `cd <node-id>` to be very fast,
   much like selecting via a rowid in a standard DB. Instead, I usually see
   several seconds delay. Doing a `ls` takes so long that I usually have to
   just kill the process. In fact `ls` never outputs anything, which is odd
   since I would expect it to stream the output as it found it. I have
   very similar performance issues with neoclipse.
  
   I am using Neo4j 1.3 embedded on Ubuntu 10.04 with 4GB of RAM.
   Disclaimer, I am new to Neo4j.
  
   Thanks,
   Andrew
   

[Neo4j] Add relationships dynamically

2011-07-07 Thread noppanit
Is there any way I can add relationships on the fly or programmatically?
Sometimes I might not know the relationship types in advance, and I want to
add them to the database.

Cheers,
 

--
View this message in context: 
http://neo4j-user-list.438527.n3.nabble.com/Add-relationships-dynamically-tp3149437p3149437.html
Sent from the Neo4J User List mailing list archive at Nabble.com.


Re: [Neo4j] Indexed relationships

2011-07-07 Thread Niels Hoogeveen

IndexedRelationship and IndexedRelationshipExpander are now in Git. See: 
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
An example:
// Orders nodes by their bit-reversed ids.
class IdComparator implements java.util.Comparator<Node> {
  public int compare(Node n1, Node n2) {
    long l1 = Long.reverse(n1.getId());
    long l2 = Long.reverse(n2.getId());
    if (l1 == l2) return 0;
    else if (l1 < l2) return -1;
    else return 1;
  }
}

static enum RelTypes implements RelationshipType {
  DIRECT_RELATIONSHIP,
  INDEXED_RELATIONSHIP,
};

Node indexedNode = graphDb().createNode();
IndexedRelationship ir = new IndexedRelationship(RelTypes.INDEXED_RELATIONSHIP,
    Direction.OUTGOING, new IdComparator(), true, indexedNode, graphDb());

Node n1 = graphDb().createNode();
n1.setProperty("name", "n1");
Node n2 = graphDb().createNode();
n2.setProperty("name", "n2");
Node n3 = graphDb().createNode();
n3.setProperty("name", "n3");
Node n4 = graphDb().createNode();
n4.setProperty("name", "n4");

// n1 and n3 are attached directly; n2 and n4 go through the indexed relationship.
indexedNode.createRelationshipTo(n1, RelTypes.DIRECT_RELATIONSHIP);
indexedNode.createRelationshipTo(n3, RelTypes.DIRECT_RELATIONSHIP);
ir.createRelationshipTo(n2);
ir.createRelationshipTo(n4);

IndexedRelationshipExpander re1 = new IndexedRelationshipExpander(graphDb(),
    Direction.OUTGOING, RelTypes.DIRECT_RELATIONSHIP);
IndexedRelationshipExpander re2 = new IndexedRelationshipExpander(graphDb(),
    Direction.OUTGOING, RelTypes.INDEXED_RELATIONSHIP);

for (Relationship rel : re1.expand(indexedNode)) {
  System.out.println(rel.getEndNode().getProperty("name"));
}
for (Relationship rel : re2.expand(indexedNode)) {
  System.out.println(rel.getEndNode().getProperty("name"));
}

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Thu, 7 Jul 2011 16:55:36 +0200
 Subject: Re: [Neo4j] Indexed relationships
 
 
 Hi Michael,

 I realize that the implementation of IndexedRelationship can in fact
 support returning relationships, and I have a preliminary version running
 locally now.

 The returned relationships can support all methods of the Relationship
 interface, returning the node pointing to the treeRoot as the startNode,
 and returning the node set as the key_value as the endNode.

 All relationship properties will be stored on the KEY_VALUE relationship
 pointing to the endNode.

 There is one caveat to this solution: the returned relationships cannot
 support the getId() method, and will throw an UnsupportedOperationException
 when it is called.

 IndexedRelationship will implement Iterable<Relationship>.

 With these changes, it is possible to create an Expander and I am working
 right now to implement that.

 Niels
 
  From: pd_aficion...@hotmail.com
  To: user@lists.neo4j.org
  Date: Thu, 7 Jul 2011 14:46:35 +0200
  Subject: Re: [Neo4j] Indexed relationships
  
  
  Hi Michael,
  
  I haven't yet worked on an example. 
  
  There are tests for the SortedTree implementation, 
  but I didn't add any for the IndexedRelationship class, 
  which is simply a wrapper around SortedTree. 
  Having a test would have caught the error 
  that no relationship to the treeNode was created 
  (fixed that bug and pushed it to Git) 
  (note to self: always create a unit test, 
  especially when code seems trivial).
  
  There is no relationship expander that uses this. 
  The RelationshipExpander has a method Iterable<Relationship> expand(Node 
  node) 
  which cannot be supported, since there is no direct relationship from 
  startnode to endnode. 
  Instead there is a path through the index tree. 
  
  It's not possible to support the original relationship-traversal API 
  since the IndexedRelationship class is not a wrapper around a node, 
  but a wrapper around the relationships of a certain RelationshipType in the 
  OUTGOING direction. 
  
  As to the name of the class. 
  It is essentially an indexed relationship, 
  and not just a solution to the densely-connected-node problem. 
  An indexed relationship can also be used to maintain 
  a sorted set of relationships of any size, 
  and can be used to guarantee uniqueness constraints.
  Niels
   From: michael.hun...@neotechnology.com
   Date: Thu, 7 Jul 2011 13:27:00 +0200
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] Indexed relationships
   
   Good work,
   
   do you have an example ready (and/or some tests that show how it works/is 
   used) ?
   
   In creation, manual traversal and automatic traversal (i.e. is there a 
   RelationshipExpander that uses it).
   
   And in the constructor if there is no relationship to the treeNode, you 
   create a new one, but that new treeNode is not connected to the actual 
   node?
   
   I'm not sure if it should support the original relationship-traversal API 
   / methods (getRelationships(Dir,type), etc).
   
   Perhaps that IndexedRelationship should rather be just a wrapper around a 
   SuperNode ? So probably rename it to SuperNode(Wrapper) or 
   HeavilyConnectedNode(Wrapper) ?)
   
   

Re: [Neo4j] Add relationships dynamically

2011-07-07 Thread Rick Bullotta
Take a look at the RelationshipType interface.  If you implement that (which is 
really simple - just a name() property), you can have your own class that can 
have relationships with any names you want.  They do need to be unique, however.
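
A minimal sketch of that idea follows. To keep it runnable without Neo4j on the
classpath, it declares a local stand-in interface with the same single name()
method as Neo4j's RelationshipType; in real code you would implement
org.neo4j.graphdb.RelationshipType instead. The class names RuntimeTypeDemo and
NamedRelationshipType are hypothetical.

```java
public class RuntimeTypeDemo {
    // Stand-in with the same shape as Neo4j's RelationshipType: one name() method.
    interface RelationshipType {
        String name();
    }

    // Hypothetical type whose name is only known at runtime.
    static final class NamedRelationshipType implements RelationshipType {
        private final String name;

        NamedRelationshipType(String name) {
            if (name == null || name.isEmpty()) {
                throw new IllegalArgumentException("a relationship type needs a name");
            }
            this.name = name;
        }

        @Override public String name() { return name; }

        // Two instances with the same name are treated as the same type.
        @Override public boolean equals(Object o) {
            return o instanceof NamedRelationshipType
                    && name.equals(((NamedRelationshipType) o).name);
        }

        @Override public int hashCode() { return name.hashCode(); }
    }

    public static void main(String[] args) {
        RelationshipType knows = new NamedRelationshipType("KNOWS");
        System.out.println(knows.name());
    }
}
```

The name is the identity of the type, which is why the names need to be unique.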


From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf Of 
noppanit [noppani...@gmail.com]
Sent: Thursday, July 07, 2011 4:01 PM
To: user@lists.neo4j.org
Subject: [Neo4j] Add relationships dynamically

Is there any way I can add relationships on the fly or programmatically?
Sometimes I might not know the relationship types in advance, and I want to
add them to the database.

Cheers,


--
View this message in context: 
http://neo4j-user-list.438527.n3.nabble.com/Add-relationships-dynamically-tp3149437p3149437.html
Sent from the Neo4J User List mailing list archive at Nabble.com.


Re: [Neo4j] Add relationships dynamically

2011-07-07 Thread kriti sharma
Also take a look at DynamicRelationshipType if you want to instantiate
relationship types at runtime.
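
In the Neo4j Java API that class offers a withName(String) factory, so a type
can be built from a string that only arrives at runtime. The sketch below does
not call Neo4j; it mimics the idea with a local stand-in interface and a
hypothetical name-normalizing cache (DynamicTypeSketch and its members are
assumptions for illustration), so it runs on its own.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DynamicTypeSketch {
    // Stand-in with the same shape as Neo4j's RelationshipType.
    interface RelationshipType {
        String name();
    }

    static final class Named implements RelationshipType {
        private final String name;
        Named(String name) { this.name = name; }
        @Override public String name() { return name; }
    }

    // One shared instance per normalized name.
    private static final Map<String, RelationshipType> TYPES = new ConcurrentHashMap<>();

    // Hypothetical factory: normalizes free text such as "lives in" to LIVES_IN
    // and returns the cached type for that name, creating it on first use.
    static RelationshipType withName(String raw) {
        String name = raw.trim().toUpperCase(Locale.ROOT).replace(' ', '_');
        return TYPES.computeIfAbsent(name, Named::new);
    }

    public static void main(String[] args) {
        System.out.println(withName("knows").name());
        System.out.println(withName("lives in").name());
    }
}
```

With Neo4j itself, the resulting type would then be passed to
createRelationshipTo like any enum-based type.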

On Thu, Jul 7, 2011 at 9:16 PM, Rick Bullotta
rick.bullo...@thingworx.com wrote:

 Take a look at the RelationshipType interface.  If you implement that
 (which is really simple - just a name() property), you can have your own
 class that can have relationships with any names you want.  They do need to
 be unique, however.

 
 From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On
 Behalf Of noppanit [noppani...@gmail.com]
 Sent: Thursday, July 07, 2011 4:01 PM
 To: user@lists.neo4j.org
 Subject: [Neo4j] Add relationships dynamically

 Is there any way I can add relationships on the fly or programmatically?
 Sometimes I might not know the relationship types in advance, and I want to
 add them to the database.

 Cheers,


 --
 View this message in context:
 http://neo4j-user-list.438527.n3.nabble.com/Add-relationships-dynamically-tp3149437p3149437.html
 Sent from the Neo4J User List mailing list archive at Nabble.com.


Re: [Neo4j] Performance issue on nodes with lots of relationships

2011-07-07 Thread Niels Hoogeveen

I am glad to see a solution will be provided at the core level. 
Today, I pushed IndexedRelationships and IndexedRelationshipExpander to Git, 
see: 
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
This provides a solution to the issue, but is certainly not as fast as a 
solution in core would be. 
However, it does solve my issues and as a bonus, indexed relationships can be 
traversed in sorted order. This is especially pleasant, since I usually want to 
know only the recent additions of dense relationships.
Niels


 Date: Thu, 7 Jul 2011 21:37:26 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Performance issue on nodes with lots of relationships
 
 2011/7/7 Agelos Pikoulas agelos.pikou...@gmail.com
 
  I think it's the same problem pattern that has been in discussion lately with
  dense nodes or supernodes (check
  http://lists.neo4j.org/pipermail/user/2011-July/009832.html).
 
  Michael Hunger has provided a quick solution to visiting the *few*
  RelationshipTypes on a node that has *millions* of others, utilizing a
  RelationshipExpander with an Index (check
  http://paste.pocoo.org/show/traM5oY1ng7dRQAaf1oV/)
 
  Ideally this would be abstracted and implemented in the core distribution so
  that all APIs (including Cypher and tinkerpop Pipes/Gremlin) can use it
  efficiently...
 
 
 Yes, I'm positive that something will be done on a core level to make
 getting relationships of a specific type regardless of the total number of
 relationships fast. In the foreseeable future hopefully.
 
 
  Agelos
 
  On Thu, Jul 7, 2011 at 3:16 PM, Andrew White li...@andrewewhite.net
  wrote:
 
   I use the shell as-is, but the messages.log is reporting...
  
   Physical mem: 3962MB, Heap size: 881MB
  
   My point is that if you ignore caching altogether, why did one run take
   17x longer with only 2.4x more data? Considering this is a rather
   iterative algorithm, I don't see why you would even read a node or
   relationship more than once and thus a cache shouldn't matter at all.
  
   In this particular case, I can't imagine taking 9+ minutes to read a
   mere 3.4M nodes (that's only 6k nodes per sec). Perhaps this is just an
   artifact of Cypher in which it is building a set of Rs before applying
   `count` rather than making count accept an iterable stream.
  
   Andrew
  
   On 07/06/2011 11:33 PM, David Montag wrote:
Hi Andrew,
   
How big is your configured Java heap? It could be that all the nodes
  and
relationships don't fit into the cache.
   
David
   
On Wed, Jul 6, 2011 at 8:03 PM, Andrew White li...@andrewewhite.net
wrote:
   
Here are some interesting stats to consider. First, I split my nodes into
two groups, one node with 1.4M children and the other with 3.4M
children. While I do see some cache warm-up improvements, the
traversal doesn't seem to scale linearly; i.e. the larger super-node has
2.4x more children but takes 17x longer to traverse.
   
neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 1468486  |
+----------+
1 rows, 25724 ms
neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 1468486  |
+----------+
1 rows, 19763 ms
   
neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 3472174  |
+----------+
1 rows, 565448 ms
neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 3472174  |
+----------+
1 rows, 337975 ms
   
Any ideas on this?
Andrew
   
On 07/06/2011 09:55 AM, Peter Neubauer wrote:
Andrew,
if you upgrade to 1.4.M06, your shell should be able to do Cypher in
order to count the relationships of a node, not returning them:
   
start n=(1) match (n)-[r]-(x) return count(r)
   
and try that several times to see if cold caches are initially
  slowing
down things.
   
or something along these lines. In the LS and Neoclipse the output
  and
visualization will be slow for that amount of data.
   
Cheers,
   
/peter neubauer
   
GTalk:  neubauer.peter
Skype   peter.neubauer
Phone   +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter  http://twitter.com/peterneubauer
   
http://www.neo4j.org   - Your high performance graph
database.
http://startupbootcamp.org/- Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
   party.
   
   
   
On Wed, Jul 6, 2011 at 4:15 PM, Andrew Whiteli...@andrewewhite.net
  wrote:
I have a graph with roughly 10M nodes. Some of these nodes are
  highly
connected to other nodes. For example I may have a single node 

Re: [Neo4j] Indexed relationships

2011-07-07 Thread Michael Hunger
Could you put these code examples into the Readme for the project or on a wiki 
page?

Am 07.07.2011 um 22:11 schrieb Niels Hoogeveen:

 
 IndexedRelationship and IndexedRelationshipExpander are now in Git. See: 
 https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
 An example:
 // Orders nodes by their bit-reversed ids.
 class IdComparator implements java.util.Comparator<Node> {
   public int compare(Node n1, Node n2) {
     long l1 = Long.reverse(n1.getId());
     long l2 = Long.reverse(n2.getId());
     if (l1 == l2) return 0;
     else if (l1 < l2) return -1;
     else return 1;
   }
 }

 static enum RelTypes implements RelationshipType {
   DIRECT_RELATIONSHIP,
   INDEXED_RELATIONSHIP,
 };

 Node indexedNode = graphDb().createNode();
 IndexedRelationship ir = new IndexedRelationship(RelTypes.INDEXED_RELATIONSHIP,
     Direction.OUTGOING, new IdComparator(), true, indexedNode, graphDb());

 Node n1 = graphDb().createNode();
 n1.setProperty("name", "n1");
 Node n2 = graphDb().createNode();
 n2.setProperty("name", "n2");
 Node n3 = graphDb().createNode();
 n3.setProperty("name", "n3");
 Node n4 = graphDb().createNode();
 n4.setProperty("name", "n4");

 // n1 and n3 are attached directly; n2 and n4 go through the indexed relationship.
 indexedNode.createRelationshipTo(n1, RelTypes.DIRECT_RELATIONSHIP);
 indexedNode.createRelationshipTo(n3, RelTypes.DIRECT_RELATIONSHIP);
 ir.createRelationshipTo(n2);
 ir.createRelationshipTo(n4);

 IndexedRelationshipExpander re1 = new IndexedRelationshipExpander(graphDb(),
     Direction.OUTGOING, RelTypes.DIRECT_RELATIONSHIP);
 IndexedRelationshipExpander re2 = new IndexedRelationshipExpander(graphDb(),
     Direction.OUTGOING, RelTypes.INDEXED_RELATIONSHIP);

 for (Relationship rel : re1.expand(indexedNode)) {
   System.out.println(rel.getEndNode().getProperty("name"));
 }
 for (Relationship rel : re2.expand(indexedNode)) {
   System.out.println(rel.getEndNode().getProperty("name"));
 }

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Thu, 7 Jul 2011 16:55:36 +0200
 Subject: Re: [Neo4j] Indexed relationships
 
 
 Hi Michael,

 I realize that the implementation of IndexedRelationship can in fact
 support returning relationships, and I have a preliminary version running
 locally now.

 The returned relationships can support all methods of the Relationship
 interface, returning the node pointing to the treeRoot as the startNode,
 and returning the node set as the key_value as the endNode.

 All relationship properties will be stored on the KEY_VALUE relationship
 pointing to the endNode.

 There is one caveat to this solution: the returned relationships cannot
 support the getId() method, and will throw an UnsupportedOperationException
 when it is called.

 IndexedRelationship will implement Iterable<Relationship>.

 With these changes, it is possible to create an Expander and I am working
 right now to implement that.

 Niels
 
 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Thu, 7 Jul 2011 14:46:35 +0200
 Subject: Re: [Neo4j] Indexed relationships
 
 
 Hi Michael,
 
 I haven't yet worked on an example. 
 
 There are tests for the SortedTree implementation, 
 but I didn't add any for the IndexedRelationship class, 
 which is simply a wrapper around SortedTree. 
 Having a test would have caught the error 
 that no relationship to the treeNode was created 
 (fixed that bug and pushed it to Git) 
 (note to self: always create a unit test, 
 especially when code seems trivial).
 
 There is no relationship expander that uses this. 
 The RelationshipExpander has a method Iterable<Relationship> expand(Node 
 node) 
 which cannot be supported, since there is no direct relationship from 
 startnode to endnode. 
 Instead there is a path through the index tree. 
 
 It's not possible to support the original relationship-traversal API 
 since the IndexedRelationship class is not a wrapper around a node, 
 but a wrapper around the relationships of a certain RelationshipType in the 
 OUTGOING direction. 
 
 As to the name of the class. 
 It is essentially an indexed relationship, 
 and not just a solution to the densely-connected-node problem. 
 An indexed relationship can also be used to maintain 
 a sorted set of relationships of any size, 
 and can be used to guarantee uniqueness constraints.
 Niels
 From: michael.hun...@neotechnology.com
 Date: Thu, 7 Jul 2011 13:27:00 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Indexed relationships
 
 Good work,
 
 do you have an example ready (and/or some tests that show how it works/is 
 used) ?
 
 In creation, manual traversal and automatic traversal (i.e. is there a 
 RelationshipExpander that uses it).
 
 And in the constructor if there is no relationship to the treeNode, you 
 create a new one, but that new treeNode is not connected to the actual 
 node?
 
 I'm not sure if it should support the original relationship-traversal API 
 / methods (getRelationships(Dir,type), etc).
 
 Perhaps that IndexedRelationship should rather be just a wrapper around a 
 SuperNode ? So probably 

Re: [Neo4j] Performance issue on nodes with lots of relationships

2011-07-07 Thread Michael Hunger
Niels, could you perhaps write up a blog post detailing the usage (also for
your own scenario), and how that solution would compare to the naive
supernodes with just millions of relationships?

Also I'd like to see a performance comparison of both approaches.

Thanks so much for your work

Michael

Am 07.07.2011 um 22:24 schrieb Niels Hoogeveen:

 
 I am glad to see a solution will be provided at the core level. 
 Today, I pushed IndexedRelationships and IndexedRelationshipExpander to Git, 
 see: 
 https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
 This provides a solution to the issue, but is certainly not as fast as a 
 solution in core would be. 
 However, it does solve my issues and as a bonus, indexed relationships can be 
 traversed in sorted order. This is especially pleasant, since I usually want 
 to know only the recent additions of dense relationships.
 Niels
 
 
 Date: Thu, 7 Jul 2011 21:37:26 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Performance issue on nodes with lots of relationships
 
 2011/7/7 Agelos Pikoulas agelos.pikou...@gmail.com
 
 I think it's the same problem pattern that has been in discussion lately with
 dense nodes or supernodes (check
 http://lists.neo4j.org/pipermail/user/2011-July/009832.html).
 
 Michael Hunger has provided a quick solution to visiting the *few*
 RelationshipTypes on a node that has *millions* of others, utilizing a
 RelationshipExpander with an Index (check
 http://paste.pocoo.org/show/traM5oY1ng7dRQAaf1oV/)
 
 Ideally this would be abstracted and implemented in the core distribution so
 that all APIs (including Cypher and tinkerpop Pipes/Gremlin) can use it
 efficiently...
 
 
 Yes, I'm positive that something will be done on a core level to make
 getting relationships of a specific type regardless of the total number of
 relationships fast. In the foreseeable future hopefully.
 
 
 Agelos
 
 On Thu, Jul 7, 2011 at 3:16 PM, Andrew White li...@andrewewhite.net
 wrote:
 
 I use the shell as-is, but the messages.log is reporting...
 
Physical mem: 3962MB, Heap size: 881MB
 
 My point is that if you ignore caching altogether, why did one run take
 17x longer with only 2.4x more data? Considering this is a rather
 iterative algorithm, I don't see why you would even read a node or
 relationship more than once and thus a cache shouldn't matter at all.
 
 In this particular case, I can't imagine taking 9+ minutes to read a
 mere 3.4M nodes (that's only 6k nodes per sec). Perhaps this is just an
 artifact of Cypher in which it is building a set of Rs before applying
 `count` rather than making count accept an iterable stream.
 
 Andrew
 
 On 07/06/2011 11:33 PM, David Montag wrote:
 Hi Andrew,
 
 How big is your configured Java heap? It could be that all the nodes
 and
 relationships don't fit into the cache.
 
 David
 
 On Wed, Jul 6, 2011 at 8:03 PM, Andrew White li...@andrewewhite.net
 wrote:
 
 Here are some interesting stats to consider. First, I split my nodes into
 two groups, one node with 1.4M children and the other with 3.4M
 children. While I do see some cache warm-up improvements, the
 traversal doesn't seem to scale linearly; i.e. the larger super-node has
 2.4x more children but takes 17x longer to traverse.
 
 neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
 +----------+
 | count(r) |
 +----------+
 | 1468486  |
 +----------+
 1 rows, 25724 ms
 neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
 +----------+
 | count(r) |
 +----------+
 | 1468486  |
 +----------+
 1 rows, 19763 ms
 
 neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
 +----------+
 | count(r) |
 +----------+
 | 3472174  |
 +----------+
 1 rows, 565448 ms
 neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
 +----------+
 | count(r) |
 +----------+
 | 3472174  |
 +----------+
 1 rows, 337975 ms
 
 Any ideas on this?
 Andrew
 
 On 07/06/2011 09:55 AM, Peter Neubauer wrote:
 Andrew,
 if you upgrade to 1.4.M06, your shell should be able to do Cypher in
 order to count the relationships of a node, not returning them:
 
 start n=(1) match (n)-[r]-(x) return count(r)
 
 and try that several times to see if cold caches are initially
 slowing
 down things.
 
 or something along these lines. In the LS and Neoclipse the output
 and
 visualization will be slow for that amount of data.
 
 Cheers,
 
 /peter neubauer
 
 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer
 
 http://www.neo4j.org   - Your high performance graph
 database.
 http://startupbootcamp.org/- Öresund - Innovation happens HERE.
 http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
 party.
 
 
 
 On Wed, Jul 6, 2011 at 4:15 PM, Andrew Whiteli...@andrewewhite.net
  wrote:
 I have a graph with roughly 10M nodes. Some of these 

Re: [Neo4j] REST batch support - transaction support for java rest client?

2011-07-07 Thread Michael Hunger
Right those are some of the issues.

So one way would be to specify a tx timeout upfront which automatically rolls 
back the tx (you can just add a kind of timer/TTL to the tx-session object) and 
clears it as well.
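
A rough sketch of such a timeout, in plain Java and independent of the Neo4j
server: a registry maps a tx token to a handle plus a deadline, and a periodic
sweep rolls back whatever has expired. Every name here (TxSessionRegistry,
TxHandle, register, lookup, sweep) is hypothetical, and the clock is passed in
explicitly only to keep the example deterministic.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class TxSessionRegistry {
    // Hypothetical handle for an open server-side transaction.
    interface TxHandle {
        void rollback();
    }

    private static final class Session {
        final TxHandle tx;
        final long deadlineMillis;
        Session(TxHandle tx, long deadlineMillis) {
            this.tx = tx;
            this.deadlineMillis = deadlineMillis;
        }
    }

    private final Map<String, Session> open = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TxSessionRegistry(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Registers a transaction and returns the token the client sends back
    // with later requests (for example as a header field).
    String register(TxHandle tx, long nowMillis) {
        String token = UUID.randomUUID().toString();
        open.put(token, new Session(tx, nowMillis + ttlMillis));
        return token;
    }

    // Returns the transaction for a token, or null if unknown or expired.
    TxHandle lookup(String token, long nowMillis) {
        Session s = open.get(token);
        return (s == null || s.deadlineMillis <= nowMillis) ? null : s.tx;
    }

    // Rolls back and forgets every expired transaction; returns how many.
    int sweep(long nowMillis) {
        int rolledBack = 0;
        Iterator<Map.Entry<String, Session>> it = open.entrySet().iterator();
        while (it.hasNext()) {
            Session s = it.next().getValue();
            if (s.deadlineMillis <= nowMillis) {
                it.remove();
                s.tx.rollback();
                rolledBack++;
            }
        }
        return rolledBack;
    }

    public static void main(String[] args) {
        TxSessionRegistry registry = new TxSessionRegistry(1000);
        String token = registry.register(() -> System.out.println("rolled back"), 0);
        System.out.println(registry.lookup(token, 500) != null);
        System.out.println(registry.sweep(1000));
    }
}
```

In a server the sweep would be driven by a timer, and a successful commit
would remove the token before its deadline.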

Keeping state on the server is always a problem but I don't see a different 
solution for that. But it might be worth a try, especially if it helps you with 
your concrete scenario.

Michael



Am 07.07.2011 um 15:30 schrieb Patrik Sundberg:

 good idea. i'll ponder it for a bit.
 
 but yes, we clearly need to keep state around, so for REST it'd be carried
 around in session. but on server side I guess you have issues with never
 ending transactions, how to cull them, etc. since it's a stateless
 req/response comm channel. on a permanent channel it's easy to detect
 disconnect and clean up, over http not as easy.
 
 thanks
 
 
 On Thu, Jul 7, 2011 at 12:42 PM, Michael Hunger 
 michael.hun...@neotechnology.com wrote:
 
 But then it would be possible to write a RequestFilter for the Neo4j-Server
 that does start and commit/rollbacks transactions.
 
 I.e. you create a tx object and put it in the session-context if there is
 none and return a tx-token that the filter uses (e.g. as header-field).
 then later you can pull it out again and attach it to the current thread
 (that's the tricky part).
 On commit or rollback you just do that with the tx (after attaching it to
 the thread).
 
 As the RestfulGraphDb and the Filter share the same execution thread this
 could/should work.
 
 I wouldn't want to support that in the neo4j server by default as this
 creates a lot of server-side state that has to be managed.
 
 But if it works out one could publish that as server-extension.
 
 HTH
 
 Michael
 
 Am 07.07.2011 um 13:30 schrieb Patrik Sundberg:
 
 Following up on the topic of transactions for client API.
 
 What is the current plan for some sort of client side API supporting
 transactions?
 
 I'm playing around with some ideas here and the lack of transaction support
 in the client API is problematic. I know there's BATCH support in the REST
 API which effectively is a transaction, but it doesn't always suit. For
 example I have the following steps that I'd like to accomplish:
 - create a reference node
 - check if a node with a given domain id exists in an index; if it does, fail
 - create an entity node for the given domain id
 - add entity node to the index
 - attach entity node to ref node
 - create a node representing a specific version of the entity node
 - attach the version node to the entity node, with some properties on the
   relationships signifying valid time
 
 That should all be considered an atomic operation, all or nothing. Doing it
 step by step is very easy and natural with REST API, but trying to roll back
 on error is flaky.
 
 I think I could batch it, but from a programming style it becomes pretty
 unnatural. Same thing with a plugin for doing the steps. The natural flow of
 code client side gets distorted by having to collect a lot of data upfront
 and then provide all that data to a method call. It's doable, it just doesn't
 seem ideal.
 
 Using an embedded db, exposing as some sort of service etc is also doable,
 it's just that my domain is graph related and I'm pretty happy with just the
 primitives and using a remote server (if I could have transactions).
 There are quite a few clients, they need to share their data, and they don't
 all run all the time, so I can't make the client API the embedded API.
 
 I'd think it's not an uncommon situation, with many people wishing for
 support for a natural client side transaction API (similar to the embedded
 API).
 
 Patrik
 
 
 On Tue, Jul 5, 2011 at 12:27 PM, Patrik Sundberg
 patrik.sundb...@gmail.com wrote:
 
 yeah, harder problem than my first hunch.
 
 sounds like plugins is the way to go for now, hopefully introduction of
 non-rest protocol with same interface as embedded API in 1.5 will simplify
 things in the future.
 
 thanks
 
 
 On Mon, Jul 4, 2011 at 11:07 PM, Michael Hunger 
 michael.hun...@neotechnology.com wrote:
 
 Patrik,
 
 I've already thought long and hard about that.
 
 The problem is you can't implement that transparently as you can never
 allow code in a second call to rely on data derived from a previous one.
 
 The simplest form that I came up with is a BatchCommand that gets an API
 interface injected that allows requests but doesn't return data.
 
 The execution of this Batch command would then return a BatchResult with
 all the data acquired during the batch operation.
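
 One possible shape for that idea, as a self-contained Java sketch. Nothing
 here is an existing Neo4j or REST client API: BatchApi, BatchCommand,
 BatchResult and execute are hypothetical names, and nodes are referred to by
 client-chosen keys because real node ids are not known while the batch is
 being recorded.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BatchSketch {
    // Hypothetical write-only API handed to a batch: every call is recorded,
    // and no call returns data.
    interface BatchApi {
        void createNode(String key);
        void createRelationship(String fromKey, String toKey, String type);
    }

    // Hypothetical unit of work that is executed as a whole.
    interface BatchCommand {
        void run(BatchApi api);
    }

    // Hypothetical result; here it only carries the recorded operations,
    // standing in for the data a server would send back.
    static final class BatchResult {
        final List<String> operations;
        BatchResult(List<String> operations) {
            this.operations = Collections.unmodifiableList(operations);
        }
    }

    static BatchResult execute(BatchCommand command) {
        final List<String> ops = new ArrayList<>();
        command.run(new BatchApi() {
            @Override public void createNode(String key) {
                ops.add("node " + key);
            }
            @Override public void createRelationship(String fromKey, String toKey, String type) {
                ops.add("rel " + fromKey + " -[" + type + "]-> " + toKey);
            }
        });
        // A real client would send the recorded operations to the server
        // in a single request at this point.
        return new BatchResult(ops);
    }

    public static void main(String[] args) {
        BatchResult result = execute(api -> {
            api.createNode("entity");
            api.createNode("version");
            api.createRelationship("entity", "version", "HAS_VERSION");
        });
        for (String op : result.operations) {
            System.out.println(op);
        }
    }
}
```

 Because the injected API returns nothing, code inside the command cannot
 branch on server data, which is the restriction described above.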
 
 Another way would be to inject the normal GraphDatabaseService interface,
 record the invocations in a first phase and then execute the batch command
 again (this time ignoring the inputs but then returning the results), but
 this is bad from a usability perspective.
 
 One critical issue is the creation of relationships as they depend on the
 correct node-ids of previously created nodes. Jacob already thought about
 some means of referring to 

Re: [Neo4j] Add relationships dynamically

2011-07-07 Thread noppanit
Thanks  a lot. :)

--
View this message in context: 
http://neo4j-user-list.438527.n3.nabble.com/Add-relationships-dynamically-tp3149437p3149791.html
Sent from the Neo4J User List mailing list archive at Nabble.com.


Re: [Neo4j] Add relationships dynamically

2011-07-07 Thread noppanit
But if I have an enum class that implements RelationshipType, doesn't that
mean that I have to define each relationship type beforehand? For example, if
I have the text "I know john" and a "know" relationship doesn't exist in
MyRelationshipType (which already implements RelationshipType), how could I
create it at runtime? I think that when I create a relationship I have to
specify a RelationshipType?

Thank you very much,

--
View this message in context: 
http://neo4j-user-list.438527.n3.nabble.com/Add-relationships-dynamically-tp3149437p3149985.html
Sent from the Neo4J User List mailing list archive at Nabble.com.


Re: [Neo4j] Indexed relationships

2011-07-07 Thread Niels Hoogeveen

I created a wiki page for indexed relationships in the Git repo, see: 
https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships

 From: michael.hun...@neotechnology.com
 Date: Thu, 7 Jul 2011 22:53:05 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Indexed relationships
 
 Could you put these code examples into the Readme for the project or on a 
 wiki page?
 
 On 07.07.2011 at 22:11, Niels Hoogeveen wrote:
 
  
  IndexedRelationship and IndexedRelationshipExpander are now in Git. See: 
  https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
  An example:
  class IdComparator implements java.util.Comparator<Node> {
    // Compares nodes by their bit-reversed ids.
    public int compare(Node n1, Node n2) {
      long l1 = Long.reverse(n1.getId());
      long l2 = Long.reverse(n2.getId());
      if (l1 == l2) return 0;
      else if (l1 < l2) return -1;
      else return 1;
    }
  }

  static enum RelTypes implements RelationshipType {
    DIRECT_RELATIONSHIP,
    INDEXED_RELATIONSHIP,
  };

  Node indexedNode = graphDb().createNode();
  IndexedRelationship ir = new IndexedRelationship(RelTypes.INDEXED_RELATIONSHIP,
      Direction.OUTGOING, new IdComparator(), true, indexedNode, graphDb());

  Node n1 = graphDb().createNode();
  n1.setProperty("name", "n1");
  Node n2 = graphDb().createNode();
  n2.setProperty("name", "n2");
  Node n3 = graphDb().createNode();
  n3.setProperty("name", "n3");
  Node n4 = graphDb().createNode();
  n4.setProperty("name", "n4");

  // n1 and n3 are attached directly; n2 and n4 go through the index tree.
  indexedNode.createRelationshipTo(n1, RelTypes.DIRECT_RELATIONSHIP);
  indexedNode.createRelationshipTo(n3, RelTypes.DIRECT_RELATIONSHIP);
  ir.createRelationshipTo(n2);
  ir.createRelationshipTo(n4);

  IndexedRelationshipExpander re1 = new IndexedRelationshipExpander(graphDb(),
      Direction.OUTGOING, RelTypes.DIRECT_RELATIONSHIP);
  IndexedRelationshipExpander re2 = new IndexedRelationshipExpander(graphDb(),
      Direction.OUTGOING, RelTypes.INDEXED_RELATIONSHIP);

  for (Relationship rel : re1.expand(indexedNode)) {
    System.out.println(rel.getEndNode().getProperty("name"));
  }
  for (Relationship rel : re2.expand(indexedNode)) {
    System.out.println(rel.getEndNode().getProperty("name"));
  }
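
A note on the comparator in the example: Long.reverse reverses the bit order of the id, so nodes are not ordered by plain ascending id. The post does not say why this ordering was chosen. The following sketch only shows what order it produces, assuming the usual ascending comparison of the reversed values; it works on raw long ids rather than Node objects so that it runs without Neo4j, and the class name ReversedIdOrder is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ReversedIdOrder {
    // Orders ids by the signed value of their bit-reversed form.
    static final Comparator<Long> BY_REVERSED_BITS = new Comparator<Long>() {
        @Override
        public int compare(Long a, Long b) {
            return Long.compare(Long.reverse(a), Long.reverse(b));
        }
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<Long> sortByReversedBits(List<Long> ids) {
        List<Long> copy = new ArrayList<Long>(ids);
        copy.sort(BY_REVERSED_BITS);
        return copy;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(0L, 1L, 2L, 3L, 4L);
        System.out.println(sortByReversedBits(ids)); // prints [1, 3, 0, 4, 2]
    }
}
```

Odd ids sort first because reversing moves their lowest bit into the sign bit, which makes the reversed value negative.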
  From: pd_aficion...@hotmail.com
  To: user@lists.neo4j.org
  Date: Thu, 7 Jul 2011 16:55:36 +0200
  Subject: Re: [Neo4j] Indexed relationships
  
  
  Hi Michael,
  I realize that the implementation of IndexedRelationship can in fact support 
  returning relationships, and I have a preliminary version running locally now.
  The returned relationships can support all methods of the Relationship 
  interface, returning the node pointing to the treeRoot as the startNode, and 
  returning the node set as the key_value as the endNode.
  All relationship properties will be stored on the KEY_VALUE relationship 
  pointing to the endNode.
  There is one caveat to this solution: the returned relationships cannot 
  support the getId() method, and will throw an UnsupportedOperationException 
  when it is called.
  IndexedRelationship will implement Iterable<Relationship>.
  With these changes, it is possible to create an Expander, and I am working 
  right now to implement that.
  Niels
  
  From: pd_aficion...@hotmail.com
  To: user@lists.neo4j.org
  Date: Thu, 7 Jul 2011 14:46:35 +0200
  Subject: Re: [Neo4j] Indexed relationships
  
  
  Hi Michael,
  
  I haven't yet worked on an example. 
  
  There are tests for the SortedTree implementation, 
  but I didn't add tests for the IndexedRelationship class, 
  which is simply a wrapper around SortedTree. 
  Having a test would have caught the error 
  that no relationship to the treeNode was created 
  (I fixed that bug and pushed it to Git) 
  (note to self: always create a unit test, 
  especially when code seems trivial).
  
  There is no relationship expander that uses this. 
  The RelationshipExpander has a method Iterable<Relationship> expand(Node node), 
  which cannot be supported, since there is no direct relationship from 
  startnode to endnode. 
  Instead there is a path through the index tree. 
  
  It's not possible to support the original relationship-traversal API 
  since the IndexedRelationship class is not a wrapper around a node, 
  but a wrapper around the relationships of a certain RelationshipType in 
  the OUTGOING direction. 
  
  As to the name of the class: 
  it is essentially an indexed relationship, 
  and not just a solution to the densely-connected-node problem. 
  An indexed relationship can also be used to maintain 
  a sorted set of relationships of any size, 
  and can be used to guarantee uniqueness constraints.
  Niels
  From: michael.hun...@neotechnology.com
  Date: Thu, 7 Jul 2011 13:27:00 +0200
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Indexed relationships
  
  Good work,
  
  do you have an example ready (and/or some tests that show how it 
  works/is used)?
  
  In creation, manual traversal and automatic traversal (i.e. is there a 
  RelationshipExpander that uses it).

Re: [Neo4j] Indexed relationships

2011-07-07 Thread Michael Hunger
Thanks

Michael

On 08.07.2011 at 01:19, Niels Hoogeveen wrote:

 
 I created a wiki page for indexed relationships in the Git repo, see: 
 https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships
 

Re: [Neo4j] Performance issue on nodes with lots of relationships

2011-07-07 Thread Niels Hoogeveen

I did a write-up on indexed relationships in the Git repo: 
https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships
A performance comparison would indeed be great. Anecdotally, I have witnessed 
the difference when trying to load all entries of DBpedia. With 2.5 GB of heap 
space, loading becomes problematic after some 70,000 relationships have been 
added to the supernode. With the indexed relationship no such problems arise, 
and 1.6 million relationships are easily created without performance 
degradation. 
Having real performance figures would be nice though.
Niels

 From: michael.hun...@neotechnology.com
 Date: Thu, 7 Jul 2011 22:56:17 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Performance issue on nodes with lots of relationships
 
 Niels, could you perhaps write up a blog post detailing the usage (also for 
 your own scenario) and how that solution would compare to the naive supernodes 
 with just millions of relationships?
 
 Also I'd like to see a performance comparison of both approaches.
 
 Thanks so much for your work
 
 Michael
 
 On 07.07.2011 at 22:24, Niels Hoogeveen wrote:
 
  
  I am glad to see a solution will be provided at the core level. 
  Today, I pushed IndexedRelationship and IndexedRelationshipExpander to 
  Git, see: 
  https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
  This provides a solution to the issue, but is certainly not as fast as a 
  solution in core would be. 
  However, it does solve my issues, and as a bonus, indexed relationships can 
  be traversed in sorted order. This is especially pleasant, since I usually 
  want to know only the recent additions of dense relationships.
  Niels
  
  
  Date: Thu, 7 Jul 2011 21:37:26 +0200
  From: matt...@neotechnology.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Performance issue on nodes with lots of relationships
  
  2011/7/7 Agelos Pikoulas agelos.pikou...@gmail.com
  
  I think it's the same problem pattern that has been in discussion lately with
  dense nodes or supernodes (check
  http://lists.neo4j.org/pipermail/user/2011-July/009832.html).
  
  Michael Hunger has provided a quick solution to visiting the *few*
  RelationshipTypes on a node that has *millions* of others, utilizing a
  RelationshipExpander with an Index (check
  http://paste.pocoo.org/show/traM5oY1ng7dRQAaf1oV/)
  
  Ideally this would be abstracted & implemented in the core distribution so
  that all APIs (including Cypher & tinkerpop Pipes/Gremlin) can use it
  efficiently...
  
  
  Yes, I'm positive that something will be done on a core level to make
  getting relationships of a specific type regardless of the total number of
  relationships fast. In the foreseeable future hopefully.
  
  
  Agelos
  
  On Thu, Jul 7, 2011 at 3:16 PM, Andrew White li...@andrewewhite.net
  wrote:
  
  I use the shell as-is, but the messages.log is reporting...
  
 Physical mem: 3962MB, Heap size: 881MB
  
  My point is that if you ignore caching altogether, why did one run take
  17x longer with only 2.4x more data? Considering this is a rather
  iterative algorithm, I don't see why you would even read a node or
  relationship more than once and thus a cache shouldn't matter at all.
  
  In this particular case, I can't imagine taking 9+ minutes to read a
  mere 3.4M nodes (that's only 6k nodes per sec). Perhaps this is just an
  artifact of Cypher, in which it is building a set of Rs before applying
  `count` rather than making count accept an iterable stream.
  
  Andrew
  
  On 07/06/2011 11:33 PM, David Montag wrote:
  Hi Andrew,
  
  How big is your configured Java heap? It could be that all the nodes
  and
  relationships don't fit into the cache.
  
  David
  
  On Wed, Jul 6, 2011 at 8:03 PM, Andrew White li...@andrewewhite.net
  wrote:
  
  Here are some interesting stats to consider. First, I split my nodes into
  two groups, one node with 1.4M children and the other with 3.4M
  children. While I do see some cache warm-up improvements, the
  traversal doesn't seem to scale linearly; i.e. the larger super-node has
  2.4x more children but takes 17x longer to traverse.
  
  neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
  +----------+
  | count(r) |
  +----------+
  | 1468486  |
  +----------+
  1 rows, 25724 ms
  neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
  +----------+
  | count(r) |
  +----------+
  | 1468486  |
  +----------+
  1 rows, 19763 ms
  
  neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
  +----------+
  | count(r) |
  +----------+
  | 3472174  |
  +----------+
  1 rows, 565448 ms
  neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
  +----------+
  | count(r) |
  +----------+
  | 3472174  |
  +----------+
  1 rows, 337975 ms
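
The ratios quoted in this thread can be checked against the four timings above. The sketch below is only arithmetic on those numbers; the class and method names are made up for illustration, and treating the first run on each node as "cold" and the second as "warm" is an assumption based on the cache warm-up remark.

```java
public class SupernodeTimings {
    // Relationship counts and run times (ms) copied from the shell output above.
    static final long SMALL_COUNT = 1468486L;
    static final long LARGE_COUNT = 3472174L;
    static final long SMALL_COLD_MS = 25724L;
    static final long SMALL_WARM_MS = 19763L;
    static final long LARGE_COLD_MS = 565448L;
    static final long LARGE_WARM_MS = 337975L;

    // How many times larger a is than b.
    static double ratio(double a, double b) {
        return a / b;
    }

    // Items processed per second, given a duration in milliseconds.
    static double perSecond(long count, long millis) {
        return count / (millis / 1000.0);
    }

    public static void main(String[] args) {
        // Roughly 2.36: the "2.4x more children" figure.
        System.out.printf("size ratio:     %.2f%n", ratio(LARGE_COUNT, SMALL_COUNT));
        // Roughly 22 for the first runs and 17 for the second runs.
        System.out.printf("cold slowdown:  %.2f%n", ratio(LARGE_COLD_MS, SMALL_COLD_MS));
        System.out.printf("warm slowdown:  %.2f%n", ratio(LARGE_WARM_MS, SMALL_WARM_MS));
        // Roughly 6,100 per second for the slowest run, the "6k per sec" figure.
        System.out.printf("large cold/sec: %.0f%n", perSecond(LARGE_COUNT, LARGE_COLD_MS));
        System.out.printf("small warm/sec: %.0f%n", perSecond(SMALL_COUNT, SMALL_WARM_MS));
    }
}
```

So the "17x" figure matches the second (warm) runs, while the first runs differ by a factor of about 22.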
  
  Any ideas on this?
  Andrew
  
  On 07/06/2011 09:55 AM, Peter Neubauer wrote:
  Andrew,
  if you upgrade to 1.4.M06, your shell should be able to do Cypher in

Re: [Neo4j] Add relationships dynamically

2011-07-07 Thread noppanit
That's such a fast reply, I'm sorry I was going to delete my previous post. I
didn't read that well. I get it now. Thanks :)

--
View this message in context: 
http://neo4j-user-list.438527.n3.nabble.com/Add-relationships-dynamically-tp3149437p3150037.html
Sent from the Neo4J User List mailing list archive at Nabble.com.


Re: [Neo4j] Indexed relationships

2011-07-07 Thread Andrew White
I don't know if this is the right place to ask this; but does it support 
a batch insert mode? When I am bulk loading data I don't have Node 
objects to pass around, only node ids.

Thanks,
Andrew

On 07/07/2011 06:19 PM, Niels Hoogeveen wrote:
 I created a wiki page for indexed relationships in the Git repo, see: 
 https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships


Re: [Neo4j] Performance issue on nodes with lots of relationships

2011-07-07 Thread Peter Neubauer
Niels,
that sounds fantastic, great work everyone so far!

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Fri, Jul 8, 2011 at 1:27 AM, Niels Hoogeveen
pd_aficion...@hotmail.com wrote:

 I did a write-up on indexed relationships in the Git repo: 
 https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships
 A performance comparison would indeed be great. Anecdotally, I have witnessed 
 the difference when trying to load all entries of DBpedia. With 2.5 GB of heap 
 space, loading becomes problematic after some 70,000 relationships have been 
 added to the supernode. With the indexed relationship no such problems arise, 
 and 1.6 million relationships are easily created without performance 
 degradation.
 Having real performance figures would be nice though.
 Niels
