Re: [Neo4j] Clarification on read and write locks

2011-09-27 Thread Bryce
Yeah I had seen that, this would allow partially for what I was hoping to
do, and is still obviously part of the implementation API.  The problem with
this is that (from my reading of LockReleaser and LockManager) you can't
release a read lock that has been attached to a transaction, otherwise you
will have an exception thrown at the point the transaction is doing its lock
cleanups.

Quick explanation of exactly why this is of interest to me:

When client code gets a node collection they can get an iterator from it.
 While they are iterating over the node collection data structure I would
like to read lock as little of the data structure as possible (e.g. say a
node representing a sub branch of a tree), then when the iteration moves out
of that particular scope release the read lock. that works perfectly
fine.

The problem is that client code can abandon an iterator at any point meaning
that either there needs to be a call made to a node collection cleanup
method, or preferably the cleanup releasing of locks is done based on a
transaction finishing (which happens if you enlist the lock into the
transaction using the lock releaser).

Guess I want the best of both worlds, releasing read locks (and even write
locks) when I am sure within the library code that they aren't needed
anymore, but then having dangling locks auto cleaned up at the end of a
transaction.

Cheers
Bryce

On Tue, Sep 27, 2011 at 10:28 AM, McKinley mckinley1...@gmail.com wrote:

 Bryce,

 Have you looked at LockManage.releaseReadLock(...)?


 https://github.com/neo4j/community/blob/master/kernel/src/main/java/org/neo4j/kernel/impl/transaction/LockManager.java#L143

 You can call that method with a null argument for the transaction which
 will
 take you here:


 https://github.com/neo4j/community/blob/master/kernel/src/main/java/org/neo4j/kernel/impl/transaction/RWLock.java#L190

 Which creates a


 https://github.com/neo4j/community/blob/master/kernel/src/main/java/org/neo4j/kernel/impl/transaction/RWLock.java#L469

 I haven't thought through what you are describing, but I just wanted to
 make
 sure you saw that part of the API.

 Cheers,

 McKinley
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Modelling with neo4j

2011-09-27 Thread Bryce
Following up on the part of this discussion about moving the enhanced api
out of the graph collections module, was meaning to get to this earlier but
got side tracked.

The dependency that IndexedRelationship had on the ComparablePropertyType
which I am assuming is what you are referring to there no longer exists.
 PropertySortedTree still has a dependency on ComparablePropertyType but
this collection could be taken along with the enhanced api as a
specialisation of the SortedTree collection specific to the enhanced api.
 As it still implements NodeCollection it can be used as the basis of an
IndexedRelationship, but IndexedRelationship doesn't need to know about it
at all.

I am wondering however whether there is a problem with this as one thing I
just realised isn't happening that previously did is the storage of
the property type into the node collection, and for that matter
PropertySortedTree currently has no node only constructor so wouldn't
currently work correctly I will look into that (hadn't done that since I
haven't even had a good look at the enhanced api yet).

On Sun, Sep 25, 2011 at 3:28 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:


 +1
 Enhanced API grew out of a couple of classes I added to make
 IndexedRelationship work more easily (not exposing comparators), but it is
 essentially a separate component. Giving it that status would help other's
 improve it. Having laid some of the ground work, I feel it needs other
 people's input too. As it stands now, it is very much a one-man's work and
 while I am confident it contains plenty of good ideas, it can only grow with
 the input of other developers, just like IndexedRelationships has become
 much better thanks to the work Bryce put into it, and the work of others to
 include graph-collections with structures I would not even have thought
 about.
 There is however one thing we need to look at. Right now IndexRelationships
 has a dependency on Enhanced API for the indexing of nodes based on a
 property. At the same time Enhanced API has a dependency on
 graph-collections, transparently supporting IndexedRelationships in the API.
 I think it would be best to remove the dependency of graph-collections on
 enhanced-api and only offer the slightly more complex option where the user
 needs to provide a comparator. The other dependency can remain and in fact
 can even be made stronger. Enhanced API could in principle be made to
 support any type of collection, now that Bryce has added a generic
 nodecollection interface.
 I agree enhanced api is not a great name, it says what it does, but
 certainly has little appeal. So I will be happy if someone can come up with
 something sexier.
 Niels
  From: peter.neuba...@neotechnology.com
  Date: Sat, 24 Sep 2011 15:42:13 +0200
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Modelling with neo4j
 
  Great thoughts guys!
  I think it would be interesting to break out the Enhanced API from
  graph-collections, rename it into something better (we can think of a
  name together) and provide a more fully fledged example that we can
  document and evolve.
 
  WDYT?
 
  Cheers,
 
  /peter neubauer
 
  GTalk:  neubauer.peter
  Skype   peter.neubauer
  Phone   +46 704 106975
  LinkedIn   http://www.linkedin.com/in/neubauer
  Twitter  http://twitter.com/peterneubauer
 
  http://www.neo4j.org   - Your high performance graph
 database.
  http://startupbootcamp.org/- Öresund - Innovation happens HERE.
  http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
 
 
 
  On Sat, Sep 24, 2011 at 3:37 PM, Rick Bullotta
  rick.bullo...@thingworx.com wrote:
   That's a great summary, Niels.  Very similar to how we've applied Neo4J
 here at ThingWorx, though we've done most of the type system work (nodes and
 relationships are all typed/subtyped) in our application domain layer.  A
 few other items that we leveraged in our implementation that you may wish to
 consider:
  
   - A common pattern we encountered was a collection of typed entities
 (e.g. a typed collection), and we implemented a specific model using
 supernodes for this.  This also allowed us to rapidly and easily
 iterate/search collections and also to organize nodes in a human
 comprehensible way that can be readily viewed with something like Neoclipse
 for troubleshooting purposes.  Also, if the type was truck, we stamped the
 node with the type truck as a property (using enumerations with a custom
 int member) and used that same enum as the relationship type between the
 node and the collection node.  In our model, an entity has a single type,
 but we implemented the concept of supertyping/subtyping in our domain model
  
   - We found quite a few examples where a one-way relationship was more
 than adequate and, instead of incurring the overhead of a relationship
 (particularly when millions of these relationships were attached to a single
 supernode), we used a *property* on a node containing the node

Re: [Neo4j] Clarification on read and write locks

2011-09-26 Thread Bryce
Hi Tobias,

Thanks for the response, these are exactly the kind of details that I was
wanting to get a better understanding of.  Would be good to get a public api
lock system working, though I know you guys have lots of time pressures :-)

From what you have said my understanding is that there is no supported
public api way of getting a read lock at present.  Doing the best with the
tools at hand at present, and understanding this could break in the future
if the implementation changes, would the following fix the acquireLock
method I posted earlier:
private void acquireLock( LockType lockType )
{
GraphDatabaseService graphDb = baseNode.getGraphDatabase();
if ( lockType == LockType.READ  graphDb instanceof
AbstractGraphDatabase )
{
Config config = ((AbstractGraphDatabase)
baseNode.getGraphDatabase()).getConfig();
config.getLockManager().getReadLock( baseNode );
config.getLockReleaser().addLockToTransaction( baseNode,
LockType.READ );
}
else {
// default to write lock if read locks unavailable
baseNode.removeProperty( ___dummy_property_to_acquire_lock___
);
}
}

My thoughts on the questions you have posed would be:

1) Should there be release() methods for locks as well? or should
locks always be tied to the transaction?
I like the idea of having the flexibility of locking a certain node for only
a part of a transaction, especially in the case of read locks. write
locks I am not so sure about, maybe being able to release a write lock only
if you didn't actually change the locked element.

I am not sure if you have looked at the graph collections stuff, and
particularly the UnrolledLinkedList implementation I recently added, but
that is the specific situation I am thinking of.  My thinking of how an
ideal locking situation would work here would be, for reading:

   - get read lock on the node representing the first page of the ULL
   - iterate over the value relationships associated with this page
   - get read lock on the node representing the second page
   - release the read lock on the first page
   - any read locks left at the end would need to be tied to transaction,
   therefore would need to be able to attach and detach locks from transactions

For writing:

   - get write lock on the base node (representing the entry point to the
   data structure)
   - do processing to find where the write is going to occur within the data
   structure (i.e. which page(s) are changed)
   - get write lock on these page nodes
   - release write lock on base node (since it hasn't changed then this
   would be instant release)
   - change the page nodes
   - the write locks associated with the page nodes would be released with
   the transaction commit

With this as little of the data structure is being locked at a given
time but enough to stop concurrency issues (I think, this is all off the
top of my head right now).

2) Where should the methods live? On the entities themselves?
i.e. node.acquireReadLock() / node.aquireWriteLock() or on the Transaction
object to signal that locks are tied to the life cycle of the transaction?
i.e. tx.readLock(node) / tx.writeLock(node)

Bit less sure about this but something like this might work:
Lock lock = node.acquireReadLock(); // or node.aquireWriteLock()
tx.addLock(lock); // now lock is tied to transaction
tx.removeLock(lock); // no longer tied to transaction

Cheers
Bryce

On Tue, Sep 27, 2011 at 12:12 AM, Tobias Ivarsson 
tobias.ivars...@neotechnology.com wrote:

 On Mon, Sep 26, 2011 at 2:28 AM, Bryce bryc...@gmail.com wrote:

  Is there any reason that access to the lock manager is a little
 difficult?
   Is there any issue with using it (both single server and HA)?  Are their
  any issues to look out for?  And does the above code look workable?
 

 The reason the LockManager is hidden away is because it is considered an
 implementation detail.
 We reserve the right to change it at any point without prior notice.
 From one version to another, the API of this class could potentially look
 completely different, or have very different semantics.
 If you write code that depend on implementation internal classes it will at
 best cause you compilation errors when upgrading to a later version of
 Neo4j, at worst break in subtle ways after you've deployed your
 application.

 There is another important thing to note about your sample code:

 On Mon, Sep 26, 2011 at 2:28 AM, Bryce bryc...@gmail.com wrote:
 
 private void acquireLock( LockType lockType )
 {
 GraphDatabaseService graphDb = baseNode.getGraphDatabase();
 if ( lockType == LockType.READ  graphDb instanceof
  AbstractGraphDatabase )
 {
 ((AbstractGraphDatabase)
  baseNode.getGraphDatabase()).getConfig().getLockManager().getReadLock(
  baseNode );
 }
 else {
 // default to write lock if read locks unavailable

Re: [Neo4j] add property to all relationships

2011-09-26 Thread Bryce
I do similar processing a lot, i.e. changes across the entire graph database
(either relationships or nodes).  Code along the lines of:

public void process() {
int count = 0;
Transaction tx = null;
for (Node node : graphDb.getAllNodes()) {
if ((count % 1) == 0) {
if (tx != null) {
System.err.println(Processed:  + count);
tx.success();
tx.finish();
}
tx = graphDb.beginTx();
}

// do some processing
}
if (tx != null) {
tx.success();
tx.finish();
}
}

will split processing of all nodes up into transaction batches of 10,000
changes.

On Mon, Sep 26, 2011 at 11:37 PM, sometime dons...@gmail.com wrote:

 Can you please tell me where to read about transaction batches ?

 --
 View this message in context:
 http://neo4j-community-discussions.438527.n3.nabble.com/add-property-to-all-relationships-tp3368779p3368893.html
 Sent from the Neo4j Community Discussions mailing list archive at
 Nabble.com.
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Traversal performance

2011-09-26 Thread Bryce
One initial suggestion would be that your memory mapped settings are
probably not very near optimal.  If you have a look at the file sizes in
your graph data directory then the closer you can get to covering each db
files entire size the better.  I would assume that some of the files will be
bigger than others and in fact you will probably find a few of them are
very small, so you are wasting memory on them that you could assign to
another memory mapping.

So in one of mine I have:
5807428  neostore.nodestore.db
335536170  neostore.relationshipstore.db
398675470  neostore.propertystore.db
1208  neostore.propertystore.db.index
6906  neostore.propertystore.db.index.keys
1112428784  neostore.propertystore.db.strings
158  neostore.propertystore.db.arrays

In which case there is no point in me assigning much if any memory to:
neostore.propertystore.db.arrays.mapped_memory
neostore.propertystore.db.index.keys.mapped_memory
neostore.propertystore.db.index.mapped_memory

The other thing to take into account is that the
neostore.nodestore.db.mapped_memory and
neostore.relationshipstore.db.mapped_memory
settings have a lot more impact on traversal than the property story
settings.  The property store settings will help when you are reading
properties from nodes or relationships.

So if you can assign memory mapping settings for nodes and relationships to
fit it all in memory map that would be good, otherwise still best to assign
more to those, and definitely don't give the ones like arrays much memory
(unless you are using them a lot).

On Tue, Sep 27, 2011 at 12:52 PM, Rick rick.devin...@gmail.com wrote:

 Looking for help on how to tune traversals, this is a great product with
 the
 best API and I want to make sure Im getting the most from it.  I'm trying
 to
 understand if 62,500 traversals per second is the best I can do given the
 following scenario:

 - 15.6M nodes
 - 15.6M relationships
 - Data is structured as shown below so that the root has 250 children, each
 of its children have 250 children, and each of their children have 250
 children
 - If i get the entire list of children and grandchildren for a top node
 (max
 3 levels deep), I get 62,500 nodes, and this takes about 800-1000ms
 - The server is a dual core quad 3.2ghz Xeon with 16gb ram
 - The neo4j.props settings are:

 neostore.nodestore.db.mapped_memory=1G
 neostore.relationshipstore.db.mapped_memory=1G
 neostore.propertystore.db.mapped_memory=1G
 neostore.propertystore.db.index.mapped_memory=1G
 neostore.propertystore.db.index.keys.mapped_memory=1G
 neostore.propertystore.db.strings.mapped_memory=1G
 neostore.propertystore.db.arrays.mapped_memory=1G


 - The code that does the traversal is

Traverser trav = user.traverse( Order.BREADTH_FIRST,
 new StopEvaluator()
 { public boolean isStopNode(TraversalPosition pos) { return pos.depth() =
 3; }  },
new ReturnableEvaluator() { public
 boolean
 isReturnableNode(TraversalPosition pos) { return pos.depth() 3; }
   }, KNOWS, Direction.BOTH );

for ( Node node : trav )
{
// Do something with node...
i++;
}


 Data example
  root
node 0-0-0
  node 0-0-1
  node 0-0-2
  ...
   node 0-1-0
  node 0-1-1
  node 0-1-2
  ...



 --
 View this message in context:
 http://neo4j-community-discussions.438527.n3.nabble.com/Traversal-performance-tp3371038p3371038.html
 Sent from the Neo4j Community Discussions mailing list archive at
 Nabble.com.
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Traversal performance

2011-09-26 Thread Bryce
It wont make any difference if the memory mapping settings are just larger
than the file sizes, or a lot larger therefore fiddling with those
settings wont make any difference from your original test.

Generally when people see very high performance it is because a lot of the
data they are traversing over is already in memory, i.e. the caches are
warmed.  So is this test you are running just from a cold start, and if so
can you try the test twice, within the same vm that is.

On Tue, Sep 27, 2011 at 3:48 PM, Rick Devinsus rick.devin...@gmail.comwrote:

 I took a look at the files and none were larger than 500MB, however it
 makes
 a lot of sense to change the memory as you suggested so I altered the
 options as shown below.  I also started eclipse with different memory
 options than the defaults (eclipse -vmargs -Xmx2000m -server).  The changes
 didn't make it any faster though.

 I had read about people getting 2M traversals per second, since I'm only
 seeing around 65000/sec  I'm starting to think that represented the number
 of nodes searched through not the number returned based on the traversal's
 criteria.


 neostore.nodestore.db.mapped_memory=1.5G
 neostore.relationshipstore.db.mapped_memory=1.5G
 neostore.propertystore.db.mapped_memory=1.5G
 neostore.propertystore.db.index.mapped_memory=1.5G
 neostore.propertystore.db.index.keys.mapped_memory=50M
 neostore.propertystore.db.strings.mapped_memory=50M
 neostore.propertystore.db.arrays.mapped_memory=50M

 my file sizes:
 neostore.relationshipstore.db 500MB
 neostore.propertystore.db 383MB
 neostore.nodestore.db 137MB
 (others are all less than 1MB)
 the largest lucene node is 367MB

 --
 View this message in context:
 http://neo4j-community-discussions.438527.n3.nabble.com/Traversal-performance-tp3371038p3371379.html
 Sent from the Neo4j Community Discussions mailing list archive at
 Nabble.com.
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Clarification on read and write locks

2011-09-25 Thread Bryce
Hi,

I am just trying to clarify my understanding of read and write locks.  There
has been discussions around getting a write lock on a given node by removing
a non-existent property from the node, which in turn triggers
a nodeManager.acquireLock which calls lockManager.getWriteLock.

The only other option I see is casting the GraphDatabaseService to an
AbstractGraphDatabase then
calling getConfig().getLockManager().getWriteLock( node ).

I am guessing the main reason for the removing non-existent property hack
is ease of use write locking?

From the lock manager you can also get a read lock, which would come in
handy, but since access to this is through AbstractGraphDatabase I would
have to check the graph db instance to make sure I can cast it, then have if
/ else code handling the different cases, e.g.:
private void acquireLock( LockType lockType )
{
GraphDatabaseService graphDb = baseNode.getGraphDatabase();
if ( lockType == LockType.READ  graphDb instanceof
AbstractGraphDatabase )
{
((AbstractGraphDatabase)
baseNode.getGraphDatabase()).getConfig().getLockManager().getReadLock(
baseNode );
}
else {
// default to write lock if read locks unavailable
baseNode.removeProperty( ___dummy_property_to_acquire_lock___
);
}
}

Is there any reason that access to the lock manager is a little difficult?
 Is there any issue with using it (both single server and HA)?  Are their
any issues to look out for?  And does the above code look workable?

Cheers
Bryce
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Clarification on read and write locks

2011-09-25 Thread Bryce
I am looking at this in relation to an graph collections class,
UnrolledLinkedList.  Since this is a library class that others might use it
has to rely on the top level interface, or more correctly it has to be able
to work off being passed a Node.

Since Node has getGraphDatabase() returning a GraphDatabaseService that is
what I have to work with.

On Mon, Sep 26, 2011 at 2:44 PM, McKinley mckinley1...@gmail.com wrote:

 Bryce,

 The situation you are facing seems to only be a problem with choosing the
 type of the interface for your variable instead of the type that implements
 the  getConfig method. For example:

 EmbeddedGraphDatabase graphDb = new EmbeddedGraphDatabase(data-dir);
 lockManager = graphDb.getConfig().getLockManager();

 That will work straight without any type checks needed. If you have a need
 to only rely on GraphDatabaseService in other parts of your code to provide
 a common base for both EmbeddedGraphDatabase and something that is
 not EmbeddedGraphDatabase then you coud perhaps easier assign:

 GraphDatabaseService graphDbService = graphDb;

 Cheers,

 McKinley
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Unrolled Linked List

2011-09-25 Thread Bryce
Thad, there is some good advice contained within those links, but it doesn't
really fit the situation that I am discussing.  The locks I am talking about
are locks on data within the graph database, not java programmatic locks
(though in a non-HA environment java locks could well be used as an
optimisation).  Think about this as row level locks in a database.

The main difference is gaining a lock on a given node within neo4j means
that the thread that has the lock will keep the lock for the entirety of the
current transaction.  I think there is no way to remove the lock earlier, as
far as I am aware, it could be done theoretically if neo4j supported nested
transactions (but presently it doesn't) -- again I am not 100% sure on this
point.

I have however realised I had missed the read locks (or at least forgotten
about them) that neo4j lock manager offers.  So most of what I was talking
about in regards to pseudo read locks is solved by actual read locks :-)

On Sat, Sep 24, 2011 at 2:33 PM, Thad Guidry thadgui...@gmail.com wrote:

 Non-blocking has a new meaning compared to the old ways:
 http://en.wikipedia.org/wiki/Non-blocking_algorithm

 So..

 Try to Reduce Lock Contention, instead of getting rid of them entirely:

 http://www.thinkingparallel.com/2007/07/31/10-ways-to-reduce-lock-contention-in-threaded-programs/

 But also do not forgot the ways of old, which are now buried deep in our
 most useful java libraries.
 http://java.sun.com/developer/technicalArticles/Programming/Performance/

 On Fri, Sep 23, 2011 at 9:09 PM, Bryce bryc...@gmail.com wrote:

  As long as all access to these nodes is through the ULL class then the
 lock
  at the base node for writes, giving a datastructure wide lock, should
 make
  that threadsafe (i.e. only one writer can lock the base at a given time,
  therefore only one writer can be adjusting the item count at a given
 time).
 
  Though in this case the lock can be gained against the page node very
  easily
  just by changing the order of the code:
 
  Node page = checkSplitNode( getPage( node ), node );
  page.setProperty( ITEM_COUNT, ((Integer) page.getProperty( ITEM_COUNT ))
 +
  1
  );
  return page.createRelationshipTo( node,
  GraphCollection.RelationshipTypes.VALUE );
 
  To:
 
  Node page = checkSplitNode( getPage( node ), node );
  Relationship relationship = page.createRelationshipTo( node,
  GraphCollection.RelationshipTypes.VALUE );
  page.setProperty( ITEM_COUNT, ((Integer) page.getProperty( ITEM_COUNT ))
 +
  1
  );
  return relationship;
 
  Where from my understanding the creation of the relationship from the
 page
  node will lock that node therefore making the subsequent line thread safe
  even without the global base node lock.
 
  I could live easily with having to have a transaction for reading over
 the
  collection, as long as multiple readers could be in the collection at the
  same time.  I am trying to think of a way of doing this, basically at
  present my thinking is along the lines of:
 
1. multiple read lock nodes are attached to the base node (say N)
2. when a thread accesses the collection for read access it locks one
 of
these nodes (by removing non-existent property)
3. therefore, as long as each read thread locks a different node, N
reading threads can be in at once
4. for a write to occur the writer first locks the base node, then
 locks
each of the read lock nodes meaning that neither readers or writers can
  get
their required lock
 
  The difficulty comes in point 3 above, how to have threads lock different
  nodes?  To do this accurately you would need to be able to check whether
  someone else has a lock on a given node at a given time.  A simple half
  working option would be having say 20 read lock nodes hanging off the
 base
  node, and randomly choose one.  Would mean there is a good chance that
  multiple readers could run at once.
 
  On Sat, Sep 24, 2011 at 10:20 AM, Niels Hoogeveen 
  pd_aficion...@hotmail.com
   wrote:
 
  
   Read integrity is really a dog. We haven't even begun to address that
 in
   the other collections.
   With regards to write locks ( and this is something we should check in
   sortedtree too ) is code like:
page.setProperty( ITEM_COUNT, ((Integer) page.getProperty( ITEM_COUNT
 ))
  +
   1 );
   This is only threadsafe if the value returned by page.getProperty(
   ITEM_COUNT ) is read from a locked node.
   Niels
  
Date: Sat, 24 Sep 2011 09:14:07 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Unrolled Linked List
   
For writing that works well, and in fact for add node I am doing that
   (just
realised I am not for remove node but I should be).  The problems for
reading are:
   
   - it should allow multiple threads to read at the same time
   - it shouldn't dictate that client code has a transaction in order
  to
   read
   
As a simple solution thats probably workable (and probably

Re: [Neo4j] HyperRelationship example

2011-09-24 Thread Bryce
Here you go:
https://github.com/neo4j/graph-collections/wiki/HyperRelationship-example

Though that page just has a link to:
https://github.com/neo4j/graph-collections/wiki/Enhanced-API

Bryce

On Sat, Sep 24, 2011 at 5:00 PM, loldrup lold...@gmail.com wrote:


 Niels Hoogeveen wrote:
 
  I just posted an example on how to use HyperRelationships:
 
 
 https://github.com/peterneubauer/graph-collections/wiki/HyperRelationship-example
 

 This link now gives 404. Does it have a new address? If so, what is it?


 Jon

 --
 View this message in context:
 http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-HyperRelationship-example-tp3204449p3363779.html
 Sent from the Neo4j Community Discussions mailing list archive at
 Nabble.com.
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Unrolled Linked List

2011-09-23 Thread Bryce
For writing that works well, and in fact for add node I am doing that (just
realised I am not for remove node but I should be).  The problems for
reading are:

   - it should allow multiple threads to read at the same time
   - it shouldn't dictate that client code has a transaction in order to
   read

As a simple solution thats probably workable (and probably the safest), and
means that HA will just work, but restricting one thread at a time into a
given node collection isn't the best.

Maybe the client code should set whether it locks the data structure when
reading, or fails with a ConcurrentModificationException when reading and
data is changed.

On Sat, Sep 24, 2011 at 6:00 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:


 A quick skim of the code shows me you have a baseNode which is an
 entrypoint for the ULL. This is a logical candidate node to use for the
 purpose of locking.
 What are the pros and cons to locking the baseNode on every read and write
 operation?
 Niels
  Date: Fri, 23 Sep 2011 09:39:38 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Unrolled Linked List
 
  Good stuff.
 
  I am presently looking into concurrent use of a given UnrolledLinkedList
 at
  least within the same graph database instance, might be a little bit
 harder
  in HA environment.  Its hard enough writing test cases for this, maybe
 even
  harder than making it work properly!  Hoping that some utility code I am
  going to produce will help with testing concurrency of other data
  structures.
 
  By concurrent use I mean concurrent use of the data within the graph, not
 of
  the given instantiation of the class, e.g. what happens when one thread
 gets
  an instance of ULL based off a given node and is iterating over it, then
  another thread gets an instance of a ULL and writes into it.
 
  Cheers
  Bryce
 
  On Fri, Sep 23, 2011 at 4:57 AM, Niels Hoogeveen
  pd_aficion...@hotmail.comwrote:
 
  
   It looks really cool.
   I always find it fun to create something and later find out it is an
   already known construction (something worth inventing).
   Anyway, I pulled your code and will removed the dependencies to the
   Enhanced API stuff this week. After that we can start adding some
   documentation.
   Niels
  
Date: Thu, 22 Sep 2011 15:57:13 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: [Neo4j] Unrolled Linked List
   
Hi all,
   
I have added an in graph representation of an unrolled linked list to
 the
graph collections code, currently just in my githug repo:
https://github.com/brycenz/graph-collections
   
See this in particular:
   
  
 https://github.com/brycenz/graph-collections/blob/master/src/main/java/org/neo4j/collections/list/UnrolledLinkedList.java
   
The name comes from:
http://en.wikipedia.org/wiki/Unrolled_linked_list
   
And it works roughly in the same manner, though I had the idea prior
 to
reading the wiki article.
   
As the UnrolledLinkedList class implements the NodeCollection
 interface
   it
can be used as the backing of an IndexedRelationship, which is done
 in
   tests
here:
   
  
 https://github.com/brycenz/graph-collections/blob/master/src/test/java/org/neo4j/collections/indexedrelationship/TestUnrolledLinkedListIndexedRelationship.java
   
The main reason for me being interested in this, and an example of
 where
this is (probably) really useful is in the following case:
   
   - you have a number of tag (or category, folder etc.) nodes
   - they each link to a large number of document (or article,
 comments,
   post etc.) nodes
   - using a single relationship type
   - you generally only are interested in showing the newest
 documents in
   descending date order (showing the head, in a paged ui)
   - documents are generally added in ascending date order (added to
 the
   head)
   
The benefits come from being able to iterate over a small percentage
 of a
collection of nodes in a fixed order without having to first load all
 the
nodes and sort them.  This reduces the amount of data read in from
 disk,
reduces the turnover of data in memory, and therefore aids with
 reduction
   in
garbage collection.  In my case I have a large number of tags with a
   large
number of items against them, I might read the first 100-200 items
 out of
   a
collection of 30,000 and therefore by not reading in the other 29800
relationships / nodes (per tag) I should be saving 90% or more.
   here's
hoping.
   
From the java doc: The structure is broken into pages of links to
 nodes
where the size of the page can be controlled at initial construction
   time.
Page size is not fixed but instead can float between a lower bound,
 and
   an
upper bound. The bounds are at a fixed margin from the page size of
 M.
   When
a page drops below the lower bound it will be joined onto

Re: [Neo4j] Unrolled Linked List

2011-09-23 Thread Bryce
As long as all access to these nodes is through the ULL class then the lock
at the base node for writes, giving a datastructure wide lock, should make
that threadsafe (i.e. only one writer can lock the base at a given time,
therefore only one writer can be adjusting the item count at a given time).

Though in this case the lock can be gained against the page node very easily
just by changing the order of the code:

Node page = checkSplitNode( getPage( node ), node );
page.setProperty( ITEM_COUNT, ((Integer) page.getProperty( ITEM_COUNT )) + 1
);
return page.createRelationshipTo( node,
GraphCollection.RelationshipTypes.VALUE );

To:

Node page = checkSplitNode( getPage( node ), node );
Relationship relationship = page.createRelationshipTo( node,
GraphCollection.RelationshipTypes.VALUE );
page.setProperty( ITEM_COUNT, ((Integer) page.getProperty( ITEM_COUNT )) + 1
);
return relationship;

Where from my understanding the creation of the relationship from the page
node will lock that node therefore making the subsequent line thread safe
even without the global base node lock.

I could live easily with having to have a transaction for reading over the
collection, as long as multiple readers could be in the collection at the
same time.  I am trying to think of a way of doing this, basically at
present my thinking is along the lines of:

   1. multiple read lock nodes are attached to the base node (say N)
   2. when a thread accesses the collection for read access it locks one of
   these nodes (by removing non-existent property)
   3. therefore, as long as each read thread locks a different node, N
   reading threads can be in at once
   4. for a write to occur the writer first locks the base node, then locks
   each of the read lock nodes meaning that neither readers or writers can get
   their required lock

The difficulty comes in point 3 above, how to have threads lock different
nodes?  To do this accurately you would need to be able to check whether
someone else has a lock on a given node at a given time.  A simple half
working option would be having say 20 read lock nodes hanging off the base
node, and randomly choose one.  Would mean there is a good chance that
multiple readers could run at once.

On Sat, Sep 24, 2011 at 10:20 AM, Niels Hoogeveen pd_aficion...@hotmail.com
 wrote:


 Read integrity is really a dog. We haven't even begun to address that in
 the other collections.
 With regards to write locks ( and this is something we should check in
 sortedtree too ) is code like:
  page.setProperty( ITEM_COUNT, ((Integer) page.getProperty( ITEM_COUNT )) +
 1 );
 This is only threadsafe if the value returned by page.getProperty(
 ITEM_COUNT ) is read from a locked node.
 Niels

  Date: Sat, 24 Sep 2011 09:14:07 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Unrolled Linked List
 
  For writing that works well, and in fact for add node I am doing that
 (just
  realised I am not for remove node but I should be).  The problems for
  reading are:
 
 - it should allow multiple threads to read at the same time
 - it shouldn't dictate that client code has a transaction in order to
 read
 
  As a simple solution thats probably workable (and probably the safest),
 and
  means that HA will just work, but restricting one thread at a time into a
  given node collection isn't the best.
 
  Maybe the client code should set whether it locks the data structure when
  reading, or fails with a ConcurrentModificationException when reading and
  data is changed.
 
  On Sat, Sep 24, 2011 at 6:00 AM, Niels Hoogeveen
  pd_aficion...@hotmail.comwrote:
 
  
   A quick skim of the code shows me you have a baseNode which is an
   entrypoint for the ULL. This is a logical candidate node to use for the
   purpose of locking.
   What are the pros and cons to locking the baseNode on every read and
 write
   operation?
   Niels
Date: Fri, 23 Sep 2011 09:39:38 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Unrolled Linked List
   
Good stuff.
   
I am presently looking into concurrent use of a given
 UnrolledLinkedList
   at
least within the same graph database instance, might be a little bit
   harder
in HA environment.  Its hard enough writing test cases for this,
 maybe
   even
harder than making it work properly!  Hoping that some utility code I
 am
going to produce will help with testing concurrency of other data
structures.
   
By concurrent use I mean concurrent use of the data within the graph,
 not
   of
the given instantiation of the class, e.g. what happens when one
 thread
   gets
an instance of ULL based off a given node and is iterating over it,
 then
another thread gets an instance of a ULL and writes into it.
   
Cheers
Bryce
   
On Fri, Sep 23, 2011 at 4:57 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:
   

 It looks really cool.
 I always find it fun

Re: [Neo4j] Unrolled Linked List

2011-09-22 Thread Bryce
Good stuff.

I am presently looking into concurrent use of a given UnrolledLinkedList at
least within the same graph database instance, might be a little bit harder
in HA environment.  Its hard enough writing test cases for this, maybe even
harder than making it work properly!  Hoping that some utility code I am
going to produce will help with testing concurrency of other data
structures.

By concurrent use I mean concurrent use of the data within the graph, not of
the given instantiation of the class, e.g. what happens when one thread gets
an instance of ULL based off a given node and is iterating over it, then
another thread gets an instance of a ULL and writes into it.

Cheers
Bryce

On Fri, Sep 23, 2011 at 4:57 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:


 It looks really cool.
 I always find it fun to create something and later find out it is an
 already known construction (something worth inventing).
 Anyway, I pulled your code and will removed the dependencies to the
 Enhanced API stuff this week. After that we can start adding some
 documentation.
 Niels

  Date: Thu, 22 Sep 2011 15:57:13 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: [Neo4j] Unrolled Linked List
 
  Hi all,
 
  I have added an in graph representation of an unrolled linked list to the
  graph collections code, currently just in my githug repo:
  https://github.com/brycenz/graph-collections
 
  See this in particular:
 
 https://github.com/brycenz/graph-collections/blob/master/src/main/java/org/neo4j/collections/list/UnrolledLinkedList.java
 
  The name comes from:
  http://en.wikipedia.org/wiki/Unrolled_linked_list
 
  And it works roughly in the same manner, though I had the idea prior to
  reading the wiki article.
 
  As the UnrolledLinkedList class implements the NodeCollection interface
 it
  can be used as the backing of an IndexedRelationship, which is done in
 tests
  here:
 
 https://github.com/brycenz/graph-collections/blob/master/src/test/java/org/neo4j/collections/indexedrelationship/TestUnrolledLinkedListIndexedRelationship.java
 
  The main reason for me being interested in this, and an example of where
  this is (probably) really useful is in the following case:
 
 - you have a number of tag (or category, folder etc.) nodes
 - they each link to a large number of document (or article, comments,
 post etc.) nodes
 - using a single relationship type
 - you generally only are interested in showing the newest documents in
 descending date order (showing the head, in a paged ui)
 - documents are generally added in ascending date order (added to the
 head)
 
  The benefits come from being able to iterate over a small percentage of a
  collection of nodes in a fixed order without having to first load all the
  nodes and sort them.  This reduces the amount of data read in from disk,
  reduces the turnover of data in memory, and therefore aids with reduction
 in
  garbage collection.  In my case I have a large number of tags with a
 large
  number of items against them, I might read the first 100-200 items out of
 a
  collection of 30,000 and therefore by not reading in the other 29800
  relationships / nodes (per tag) I should be saving 90% or more.
 here's
  hoping.
 
  From the java doc: The structure is broken into pages of links to nodes
  where the size of the page can be controlled at initial construction
 time.
  Page size is not fixed but instead can float between a lower bound, and
 an
  upper bound. The bounds are at a fixed margin from the page size of M.
 When
  a page drops below the lower bound it will be joined onto the an adjacent
  page, and when the page goes above the upper bound it will be split in
 half.
 
  I am about to do some tests with this based on my use case and will
 report
  back on the performance impacts.
 
  Cheers
  Bryce
 
  P.S. still thinking about how to make this thread safe, any suggestions
  would be appreciated (presently only one thread will be able to write at
 a
  time, I am worried about a thread reading while another is writing,
  especially when it joins / splits pages or changes the head).
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user

 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Unrolled Linked List

2011-09-21 Thread Bryce
Hi all,

I have added an in graph representation of an unrolled linked list to the
graph collections code, currently just in my githug repo:
https://github.com/brycenz/graph-collections

See this in particular:
https://github.com/brycenz/graph-collections/blob/master/src/main/java/org/neo4j/collections/list/UnrolledLinkedList.java

The name comes from:
http://en.wikipedia.org/wiki/Unrolled_linked_list

And it works roughly in the same manner, though I had the idea prior to
reading the wiki article.

As the UnrolledLinkedList class implements the NodeCollection interface it
can be used as the backing of an IndexedRelationship, which is done in tests
here:
https://github.com/brycenz/graph-collections/blob/master/src/test/java/org/neo4j/collections/indexedrelationship/TestUnrolledLinkedListIndexedRelationship.java

The main reason for me being interested in this, and an example of where
this is (probably) really useful is in the following case:

   - you have a number of tag (or category, folder etc.) nodes
   - they each link to a large number of document (or article, comments,
   post etc.) nodes
   - using a single relationship type
   - you generally only are interested in showing the newest documents in
   descending date order (showing the head, in a paged ui)
   - documents are generally added in ascending date order (added to the
   head)

The benefits come from being able to iterate over a small percentage of a
collection of nodes in a fixed order without having to first load all the
nodes and sort them.  This reduces the amount of data read in from disk,
reduces the turnover of data in memory, and therefore aids with reduction in
garbage collection.  In my case I have a large number of tags with a large
number of items against them, I might read the first 100-200 items out of a
collection of 30,000 and therefore by not reading in the other 29800
relationships / nodes (per tag) I should be saving 90% or more. here's
hoping.

From the java doc: The structure is broken into pages of links to nodes
where the size of the page can be controlled at initial construction time.
Page size is not fixed but instead can float between a lower bound, and an
upper bound. The bounds are at a fixed margin from the page size of M. When
a page drops below the lower bound it will be joined onto the an adjacent
page, and when the page goes above the upper bound it will be split in half.

I am about to do some tests with this based on my use case and will report
back on the performance impacts.

Cheers
Bryce

P.S. still thinking about how to make this thread safe, any suggestions
would be appreciated (presently only one thread will be able to write at a
time, I am worried about a thread reading while another is writing,
especially when it joins / splits pages or changes the head).
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

2011-09-20 Thread Bryce
Hi Niels,

Probably is a good idea.  I will try to get something done around that soon,
flat out with work issues/features at present (including a nice
concurrency bug, argh).

Cheers
Bryce

On Wed, Sep 21, 2011 at 2:01 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:


 Hi Bryce,
 Sorry for the late response.
 I understand it's difficult to come up with a really good use-case for
 making NodeCollection more general in the context of IndexedRelationships,
 but I like to think of that interface as something we can eventually use for
 all sorts of collections, not just the ones derived from SortedTree.
 There is of course the issue that relationships can not attach to
 relationships, so collections of relationships will need to be addressed by
 Id. This is not necessarily a bad thing, because it decouples the container
 and the elements. In other words the container knows what elements it
 contains, but the elements don't know in what containers they are placed.
 Another option would be to create shadow nodes for contained relationships.
 Instead of adding a relationships to the collection, its shadow node is
 added and both the shadow node and the relationship contain pointers
 (properties with Id values) towards each other.
 I think it would be best if we do indeed create a GraphCollection interface
 parameterized by T extends PropertyContainer  even if that type parameter
 for now is always a Node. It doesn't add much complexity now to do it, and
 later on we may regret it and by then it becomes harder to do because there
 is an installed base.
 Niels

  Date: Sat, 17 Sep 2011 14:19:04 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Neo4j graph collections introduction of
 NodeCollection   interface
 
  Hi Niels,
 
  I had wondered about having a collection interface that covered both
 nodes
  and relationships.  There were a couple of reasons I didn't go with that
  right now, though well worthwhile discussing it and going with a
  GraphCollection super interface if it fits properly.
 
  Firstly I wanted to get something out there so people could have a
 look,
  and having something that matched what IndexedRelationship currently
  required was easiest first step.  Biggest thing specific in there to that
  functionality is the addNode method returning a relationship.
 
  The other issue was more wondering how a relationship collection would
 work.
   Say I have a relationship collection, and I have a relationship R1
 between
  node A and B, how am I going to represent that relationship withing some
  graph based data structure that makes sense.  There could be a node X
 that
  is part of the relationship collection data structure (e.g. tree) and
 that
  node could have an attribute that has the relationship id on it, but that
  doesn't seem like it would be very performant.  There could be a
  relationship between X and A that also gave the relationship type of R1,
 so
  you could find the relationship based on that, but there isn't
  any guarantee of the relationship type being unique.  What it would need
 to
  properly model it is the ability to have a relationship between X and R1,
  i.e. a relationship from a node to a relationship.
 
  If instead of being able to add any given relationship to the
 relationship
  collection you instead restrict it to being relationships matching a
 certain
  criteria from a given node then it is practically the same thing as a
  relationship expander.
 
  Or if you instead have a way through the relationship collection to
 create
  relationships from a given node to a set of other arbitrary nodes, with
 the
  relationship collection having a fixed relationship type and direction,
 then
  that is practically the current IndexedRelationship.
 
  I guess a way it could work is similar to IndexedRelationship, basically
  more general case of that class, where you have a method on the
 relationship
  collection createRelationship(startNode, endNode, relationshipType,
  direction) that was then stored in an internal data structure to create a
  pseudo relationship between the start and end, and then being able to
  iterate over this set of relationships.  Not sure exactly what the use
 case
  of that would be.  Maybe of more interest could be the same situation
 where
  the relationship type and direction are fixed, then you may have a
 friend
  of set of relationships that you create between arbitrary nodes and then
  iterate over all of those.
 
  I can't personally think of a good way of adding a set of arbitrary
  relationships into a collection stored in a graph data structure.
 
  Thoughts?
 
  Cheers
  Bryce
 
  P.S. Peter, I had thought to remove the passing in of the graph database
 and
  instead just getting it from the node, or only passing in the graph
 database
  and creating the node internally.
 
  On Sat, Sep 17, 2011 at 2:19 AM, Niels Hoogeveen
  pd_aficion...@hotmail.comwrote:
 
  
   Hi Bryce,
   I really like what

Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

2011-09-16 Thread Bryce
Hi Niels,

I had wondered about having a collection interface that covered both nodes
and relationships.  There were a couple of reasons I didn't go with that
right now, though well worthwhile discussing it and going with a
GraphCollection super interface if it fits properly.

Firstly I wanted to get something out there so people could have a look,
and having something that matched what IndexedRelationship currently
required was easiest first step.  Biggest thing specific in there to that
functionality is the addNode method returning a relationship.

The other issue was more wondering how a relationship collection would work.
 Say I have a relationship collection, and I have a relationship R1 between
node A and B, how am I going to represent that relationship withing some
graph based data structure that makes sense.  There could be a node X that
is part of the relationship collection data structure (e.g. tree) and that
node could have an attribute that has the relationship id on it, but that
doesn't seem like it would be very performant.  There could be a
relationship between X and A that also gave the relationship type of R1, so
you could find the relationship based on that, but there isn't
any guarantee of the relationship type being unique.  What it would need to
properly model it is the ability to have a relationship between X and R1,
i.e. a relationship from a node to a relationship.

If instead of being able to add any given relationship to the relationship
collection you instead restrict it to being relationships matching a certain
criteria from a given node then it is practically the same thing as a
relationship expander.

Or if you instead have a way through the relationship collection to create
relationships from a given node to a set of other arbitrary nodes, with the
relationship collection having a fixed relationship type and direction, then
that is practically the current IndexedRelationship.

I guess a way it could work is similar to IndexedRelationship, basically
more general case of that class, where you have a method on the relationship
collection createRelationship(startNode, endNode, relationshipType,
direction) that was then stored in an internal data structure to create a
pseudo relationship between the start and end, and then being able to
iterate over this set of relationships.  Not sure exactly what the use case
of that would be.  Maybe of more interest could be the same situation where
the relationship type and direction are fixed, then you may have a friend
of set of relationships that you create between arbitrary nodes and then
iterate over all of those.

I can't personally think of a good way of adding a set of arbitrary
relationships into a collection stored in a graph data structure.

Thoughts?

Cheers
Bryce

P.S. Peter, I had thought to remove the passing in of the graph database and
instead just getting it from the node, or only passing in the graph database
and creating the node internally.

On Sat, Sep 17, 2011 at 2:19 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:


 Hi Bryce,
 I really like what you are trying to achieve here.
 One question:
 Instead of having NodeCollection, why not have GraphCollectionT extends
 PropertyContainer. That way we can have collections of both Relationships
 and Nodes.
 Niels

  Date: Fri, 16 Sep 2011 17:37:29 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: [Neo4j] Neo4j graph collections introduction of NodeCollection
 interface
 
  Hi,
 
  I had mentioned in a previous thread that I was working on introducing a
  NodeCollection interface to remove the dependency from
 IndexedRelationship
  to SortedTree.  I have an initial cut of this up now in my github repo:
  https://github.com/brycenz/graph-collections
 
  It would be great to get community feedback on this as I think that
 having a
  well designed and common NodeCollection interface would help for multiple
  use cases, e.g. sortedTreeNodeCollection.addAll(linkedListNodeCollection)
  doing exactly what you think it would.
 
  IndexedRelationship now takes a node to index relationships from, a
  relationship type, and a direction, as well as a NodeCollection at
 creation
  time.  As in the unit tests this then leads to:
 
  Node indexedNode = graphDb().createNode();
  SortedTree st = new SortedTree( graphDb(), graphDb().createNode(), new
  IdComparator(), true, RelTypes.INDEXED_RELATIONSHIP.name() );
 
  IndexedRelationship ir = new IndexedRelationship( indexedNode,
  RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING, st );
 
  To create the IndexedRelationship.  To later add nodes to the
 relationship
  you need to create an instance of IndexedRelationship without the
  NodeCollection:
 
  IndexedRelationship ir = new IndexedRelationship( indexedNode,
  RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING );
 
 
  What this means from a NodeCollection implementation point of view is
 that
  firstly it needs to use the NodeCollection.RelationshipType.VALUE

[Neo4j] Neo4j graph collections introduction of NodeCollection interface

2011-09-15 Thread Bryce
Hi,

I had mentioned in a previous thread that I was working on introducing a
NodeCollection interface to remove the dependency from IndexedRelationship
to SortedTree.  I have an initial cut of this up now in my github repo:
https://github.com/brycenz/graph-collections

It would be great to get community feedback on this as I think that having a
well designed and common NodeCollection interface would help for multiple
use cases, e.g. sortedTreeNodeCollection.addAll(linkedListNodeCollection)
doing exactly what you think it would.

IndexedRelationship now takes a node to index relationships from, a
relationship type, and a direction, as well as a NodeCollection at creation
time.  As in the unit tests this then leads to:

Node indexedNode = graphDb().createNode();
SortedTree st = new SortedTree( graphDb(), graphDb().createNode(), new
IdComparator(), true, RelTypes.INDEXED_RELATIONSHIP.name() );

IndexedRelationship ir = new IndexedRelationship( indexedNode,
RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING, st );

To create the IndexedRelationship.  To later add nodes to the relationship
you need to create an instance of IndexedRelationship without the
NodeCollection:

IndexedRelationship ir = new IndexedRelationship( indexedNode,
RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING );


What this means from a NodeCollection implementation point of view is that
firstly it needs to use the NodeCollection.RelationshipType.VALUE
relationship to connect from its internal data structure to the nodes being
added to the collection, and it needs to be able to recreate itself from a
base node that is passed into a constructor (that only takes the base node).
 A node collection also needs to store its class name on the base node for
later construction purposes, as well as any other data required to recreate
the NodeCollection instance (in the case of SortedTree this is the
comparator class, the tree name, and whether it is a unique index.

Niels, you may want to have a good look over SortedTree, I have made a few
changes to it, mainly around introduction of a base node, and changing of
the end value relationships.  This could be cleaned up better, but I wanted
to start with minimal changes.

Both IndexedRelationship and IndexedRelationshipExpander have no
dependencies on SortedTree now, and should work with any properly
implemented NodeCollection.  I will be putting together a paged linked list
NodeCollection next to try this.

Some future thoughts for NodeCollection, addition of as many of the
java.util.Collection methods (e.g. addAll, removeAll, retainAll, contains,
containsAll) as well as an abstract base NodeCollection to help provide
non-optimised support for these methods.

Cheers
Bryce
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Neo4j Write Test Compile Error

2011-09-08 Thread Bryce
Hi James,

I initially gave this a go and it worked, but found I had cached version of
the parent pom in my local maven repo.  Removing this I got the same issue
you had.

You can get this working by replacing the parent pom reference in the
pom.xml file with:
  parent
groupIdorg.neo4j.build/groupId
artifactIdparent-central/artifactId
version25/version
  /parent
Instead of the current:
  parent
groupIdorg.neo4j/groupId
artifactIdparent-pom/artifactId
version6/version
  /parent

It still wont work quite as is though.  The build will fail on a license
check, so do compile like:
mvn -Dlicense.failIfMissing=false compile

Should work.

Cheers
Bryce

On Fri, Sep 9, 2011 at 2:12 PM, espeed ja...@jamesthornton.com wrote:

 Hi Guys -

 I'm trying to compile and run the write test
 (https://svn.neo4j.org/laboratory/users/johan/write-test) from
 http://wiki.neo4j.org/content/Linux_Performance_Guide, and I'm getting
 this
 error:

   https://gist.github.com/1205327

 Is there a newer version of this?

 Thanks.

 - James

 --
 View this message in context:
 http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Write-Test-Compile-Error-tp3321729p3321729.html
 Sent from the Neo4j Community Discussions mailing list archive at
 Nabble.com.
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
Hi,

As I mentioned a while ago I am looking at using IndexedRelationship's
within my application.  The major thing that was missing for me to be able
to do this was IndexedRelationshipExpander being able to provide all the
relationships from the leaf end of indexed relationships through the the
root end.  So I have been working on getting that support in there.

However in writing this I have discovered a number of other issues that I
have also fixed, and at least one I am still working on.  Since I was right
into the extra support for expanding the relationships it is hard to break
out these fixes as a separate commit (which I think would be ideal), so it
will most likely all come in together hopefully later today (NZ time).

Just letting everyone know in case someone else is doing development against
indexed relationships.

Quick run down of the issues, note: N -- IR(X) -- {A,B} below means there
is a indexed relationship from N to A  B, of type X.

1) Exception thrown when more than one IR terminates at a given node, e.g.:
N1 -- IR(X) -- {A,B,C,D}
N2 -- IR(X) -- {A,X,Y,Z}
Will throw an exception when using the IndexedRelationshipExpander on either
N1, or N2.

2) Start / End nodes are transposed when the IR has an direction of
incoming, i.e. the IR is created against N but across a set of incoming
relationships:
N -- IR(Y) -- {A,B,C}
Will return 3 relationships N -- A, N -- B, N -- C.

I have written tests for each of these, as well as a couple of other tests.

Still completing (1) and have a little question about this.  In order to fix
this I may need to introduce a unique ID stored against the IR both at the
root and at the leaves.  Currently the relationship type is used to name the
IR at both root and leaves, but in the case above that means you can't tell
from node A which KEY_VALUE relationship belongs to which IR tree without
traversing the tree.

So the question is adding this ID would mean that anyone who is already
using this wont have the ID, and therefore without care will be data
incompatible with the updated code.  This could be managed via a check for
the ID when accessing the tree and if it isn't there doing a walk over the
tree to populate all the places where it is required.

In general in developing against this code where do we sit on data
compatibility and API compatibility?

Cheers
Bryce
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
Hi Niels,

Sorry I didn't quite write the bit about (1) clearly enough.  The problem is
that it presently throws an Exception where it shouldn't.

This stems from IndexedRelationship.DirectRelationship:
this.endRelationship = endNode.getSingleRelationship(
SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );

So if the end node has more than one incoming KEY_VALUE relationship a more
than one relationship exception is thrown.

Instead of the getSingleRelationship I was planning on iterating over the
relationships and matching the UUID stored at the root end of the IR with
one of the KEY_VALUE relationships (which is why using a unique id is
necessary rather than the relationship type).  Note: there will actually
still be an issue if the same IR has multiple relationships to the same leaf
node - still thinking about that might need .

Is storing the UUID as two longs much quicker than storing it as a string?
 Curious about this since in my current model I have all the domain objects
with UUID's, and these are all stored as strings.  If it was going to help
with either memory or performance then I would be keen to migrate this to
two longs.

Cheers
Bryce

On Thu, Sep 8, 2011 at 11:07 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:


 Great work Bryce,
 I do have a question though.
 What is the rationale for the restriction mentioned under 1). Do you need
 this for the general case (to make IndexedRelationshipExpander work
 correctly), or do you need it for your own application to throw that
 exception? If the latter is the case, I think it would be important to tease
 out the general case and offer this new behaviour as an option.
 A unique key for the index is a good idea anyway and can be added to
 SortedTree. Generate a UUID and store it in two long properties. That way
 the two values will always be read in the first fetch of the underlying
 PropertyContainer. A getId method on the TreeNodes can then return a String
 representation of of the two long values.
 IndexRelationships are a relatively new development, so I think you are one
 of the first to actually try it out. Personally I have chosen to directly
 work with SortedTree, because I am working within the framework of a wrapper
 API, so I can integrate the functionality behind the regular
 createRelationshipTo and getRelationships methods.
 I don't think API changes will be an issue at the moment.
 Niels
  Date: Thu, 8 Sep 2011 10:22:11 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: [Neo4j] Issues with IndexedRelationship
 
  Hi,
 
  As I mentioned a while ago I am looking at using IndexedRelationship's
  within my application.  The major thing that was missing for me to be
 able
  to do this was IndexedRelationshipExpander being able to provide all the
  relationships from the leaf end of indexed relationships through the the
  root end.  So I have been working on getting that support in there.
 
  However in writing this I have discovered a number of other issues that I
  have also fixed, and at least one I am still working on.  Since I was
 right
  into the extra support for expanding the relationships it is hard to
 break
  out these fixes as a separate commit (which I think would be ideal), so
 it
  will most likely all come in together hopefully later today (NZ time).
 
  Just letting everyone know in case someone else is doing development
 against
  indexed relationships.
 
  Quick run down of the issues, note: N -- IR(X) -- {A,B} below means
 there
  is a indexed relationship from N to A  B, of type X.
 
  1) Exception thrown when more than one IR terminates at a given node,
 e.g.:
  N1 -- IR(X) -- {A,B,C,D}
  N2 -- IR(X) -- {A,X,Y,Z}
  Will throw an exception when using the IndexedRelationshipExpander on
 either
  N1, or N2.
 
  2) Start / End nodes are transposed when the IR has an direction of
  incoming, i.e. the IR is created against N but across a set of incoming
  relationships:
  N -- IR(Y) -- {A,B,C}
  Will return 3 relationships N -- A, N -- B, N -- C.
 
  I have written tests for each of these, as well as a couple of other
 tests.
 
  Still completing (1) and have a little question about this.  In order to
 fix
  this I may need to introduce a unique ID stored against the IR both at
 the
  root and at the leaves.  Currently the relationship type is used to name
 the
  IR at both root and leaves, but in the case above that means you can't
 tell
  from node A which KEY_VALUE relationship belongs to which IR tree without
  traversing the tree.
 
  So the question is adding this ID would mean that anyone who is already
  using this wont have the ID, and therefore without care will be data
  incompatible with the updated code.  This could be managed via a check
 for
  the ID when accessing the tree and if it isn't there doing a walk over
 the
  tree to populate all the places where it is required.
 
  In general in developing against this code where do we sit on data
  compatibility and API compatibility

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
Will have to experiment with changing my id's to be stored as longs, it does
make perfect sense really that it would be better.  Thanks for the hint.

In regards to SortedTree returning the KEY_VALUE relationship instead of the
end Node, I had thought of that too, and it would definitely help.  Could
end up being a significant change to SortedTree though, e.g.:
  sortedTree.addNode( node );
Would need to return the KEY_VALUE relationship instead of a boolean.  Which
not knowing where else SortedTree is used could be a large change?

Maybe SortedTree would have two iterator's available a key_value
relationship iterator, and a node iterator.  Having a quick look at it now
it seems that it could work ok that way without introducing much code
duplication.

On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:


 Two longs is certainly cheaper than a string. Two longs take 128 bit and
 are stored in the main record of the PropertyContainer, while a String would
 require a 64 bit pointer in the main record of the PropertyContainer, and
 an additional read in the String store where the string representation will
 take up 256 bits. So both memory-wise, as perfomance wise, it is better to
 store a UUID as two long values.


 The main issue is something that needs a deeper fix than adding ID's.
 SortedTree now returns Nodes when traversing the tree. We should however
 return the KEY_VALUE Relationship to the indexed Node. Then
 IndexedRelationship.DirectRelationship can be created with that relationship
 as an argument. We get the Direction and the RelationshipType for free.
 Niels

  Date: Thu, 8 Sep 2011 11:36:11 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Issues with IndexedRelationship
 
  Hi Niels,
 
  Sorry I didn't quite write the bit about (1) clearly enough.  The problem
 is
  that it presently throws an Exception where it shouldn't.
 
  This stems from IndexedRelationship.DirectRelationship:
  this.endRelationship = endNode.getSingleRelationship(
  SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
 
  So if the end node has more than one incoming KEY_VALUE relationship a
 more
  than one relationship exception is thrown.
 
  Instead of the getSingleRelationship I was planning on iterating over the
  relationships and matching the UUID stored at the root end of the IR with
  one of the KEY_VALUE relationships (which is why using a unique id is
  necessary rather than the relationship type).  Note: there will actually
  still be an issue if the same IR has multiple relationships to the same
 leaf
  node - still thinking about that might need .
 
  Is storing the UUID as two longs much quicker than storing it as a
 string?
   Curious about this since in my current model I have all the domain
 objects
  with UUID's, and these are all stored as strings.  If it was going to
 help
  with either memory or performance then I would be keen to migrate this to
  two longs.
 
  Cheers
  Bryce
 
  On Thu, Sep 8, 2011 at 11:07 AM, Niels Hoogeveen
  pd_aficion...@hotmail.comwrote:
 
  
   Great work Bryce,
   I do have a question though.
   What is the rationale for the restriction mentioned under 1). Do you
 need
   this for the general case (to make IndexedRelationshipExpander work
   correctly), or do you need it for your own application to throw that
   exception? If the latter is the case, I think it would be important to
 tease
   out the general case and offer this new behaviour as an option.
   A unique key for the index is a good idea anyway and can be added to
   SortedTree. Generate a UUID and store it in two long properties. That
 way
   the two values will always be read in the first fetch of the underlying
   PropertyContainer. A getId method on the TreeNodes can then return a
 String
   representation of of the two long values.
   IndexRelationships are a relatively new development, so I think you are
 one
   of the first to actually try it out. Personally I have chosen to
 directly
   work with SortedTree, because I am working within the framework of a
 wrapper
   API, so I can integrate the functionality behind the regular
   createRelationshipTo and getRelationships methods.
   I don't think API changes will be an issue at the moment.
   Niels
Date: Thu, 8 Sep 2011 10:22:11 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: [Neo4j] Issues with IndexedRelationship
   
Hi,
   
As I mentioned a while ago I am looking at using
 IndexedRelationship's
within my application.  The major thing that was missing for me to be
   able
to do this was IndexedRelationshipExpander being able to provide all
 the
relationships from the leaf end of indexed relationships through the
 the
root end.  So I have been working on getting that support in there.
   
However in writing this I have discovered a number of other issues
 that I
have also fixed, and at least one I am still working on.  Since I

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
 this to
two longs.
   
Cheers
Bryce
   
On Thu, Sep 8, 2011 at 11:07 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:
   

 Great work Bryce,
 I do have a question though.
 What is the rationale for the restriction mentioned under 1). Do
 you
   need
 this for the general case (to make IndexedRelationshipExpander work
 correctly), or do you need it for your own application to throw
 that
 exception? If the latter is the case, I think it would be important
 to
   tease
 out the general case and offer this new behaviour as an option.
 A unique key for the index is a good idea anyway and can be added
 to
 SortedTree. Generate a UUID and store it in two long properties.
 That
   way
 the two values will always be read in the first fetch of the
 underlying
 PropertyContainer. A getId method on the TreeNodes can then return
 a
   String
 representation of of the two long values.
 IndexRelationships are a relatively new development, so I think you
 are
   one
 of the first to actually try it out. Personally I have chosen to
   directly
 work with SortedTree, because I am working within the framework of
 a
   wrapper
 API, so I can integrate the functionality behind the regular
 createRelationshipTo and getRelationships methods.
 I don't think API changes will be an issue at the moment.
 Niels
  Date: Thu, 8 Sep 2011 10:22:11 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: [Neo4j] Issues with IndexedRelationship
 
  Hi,
 
  As I mentioned a while ago I am looking at using
   IndexedRelationship's
  within my application.  The major thing that was missing for me
 to be
 able
  to do this was IndexedRelationshipExpander being able to provide
 all
   the
  relationships from the leaf end of indexed relationships through
 the
   the
  root end.  So I have been working on getting that support in
 there.
 
  However in writing this I have discovered a number of other
 issues
   that I
  have also fixed, and at least one I am still working on.  Since I
 was
 right
  into the extra support for expanding the relationships it is hard
 to
 break
  out these fixes as a separate commit (which I think would be
 ideal),
   so
 it
  will most likely all come in together hopefully later today (NZ
   time).
 
  Just letting everyone know in case someone else is doing
 development
 against
  indexed relationships.
 
  Quick run down of the issues, note: N -- IR(X) -- {A,B} below
 means
 there
  is a indexed relationship from N to A  B, of type X.
 
  1) Exception thrown when more than one IR terminates at a given
 node,
 e.g.:
  N1 -- IR(X) -- {A,B,C,D}
  N2 -- IR(X) -- {A,X,Y,Z}
  Will throw an exception when using the
 IndexedRelationshipExpander on
 either
  N1, or N2.
 
  2) Start / End nodes are transposed when the IR has an direction
 of
  incoming, i.e. the IR is created against N but across a set of
   incoming
  relationships:
  N -- IR(Y) -- {A,B,C}
  Will return 3 relationships N -- A, N -- B, N -- C.
 
  I have written tests for each of these, as well as a couple of
 other
 tests.
 
  Still completing (1) and have a little question about this.  In
 order
   to
 fix
  this I may need to introduce a unique ID stored against the IR
 both
   at
 the
  root and at the leaves.  Currently the relationship type is used
 to
   name
 the
  IR at both root and leaves, but in the case above that means you
   can't
 tell
  from node A which KEY_VALUE relationship belongs to which IR tree
   without
  traversing the tree.
 
  So the question is adding this ID would mean that anyone who is
   already
  using this wont have the ID, and therefore without care will be
 data
  incompatible with the updated code.  This could be managed via a
   check
 for
  the ID when accessing the tree and if it isn't there doing a walk
   over
 the
  tree to populate all the places where it is required.
 
  In general in developing against this code where do we sit on
 data
  compatibility and API compatibility?
 
  Cheers
  Bryce
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user

 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
  
   ___
   Neo4j mailing list
   User@lists.neo4j.org
   https://lists.neo4j.org/mailman/listinfo/user

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
I have made the changes in regards to SortedTree in regards to relationships
vs nodes, and have got all the tests passing.  The changes are pushed up to
my github account (and pull request has been raised).

The changes can be seen here:
https://github.com/brycenz/graph-collections

On Thu, Sep 8, 2011 at 3:41 PM, Bryce bryc...@gmail.com wrote:

 Another thought if there is going to be a larger refactor of the code is
 whether the indexing mechanism should be broken out as a strategy for the
 IndexedRelationship.  At present it is tied to SortedTree, but if an
 interface was extracted out that had addNode, removeNode, iterator, and
 isUniqueIndex then other indexing implementations could be used in certain
 cases.

 The particular other implementation I am currently thinking of that could
 be of use to me would be a paged linked list.  So that would have a linked
 list of pages, each with min  x  max KEY_VALUE (or equivalent)
 relationships.  I think that could work quite well for the situation where
 the index is descending date ordered, and generally just appended at the
 most recent end, and results are retrieved in a paged manner generally from
 near the most recent.

 But more to the point there could be any number of implementations that
 would be good for given different situations.

 That does bring up a question though, there was some discussion a while ago
 about some functionality along the lines of IndexedRelationship being pulled
 into the core, so is that overkill for now if there is going to be another
 core offering later?


 On Thu, Sep 8, 2011 at 2:38 PM, Niels Hoogeveen pd_aficion...@hotmail.com
  wrote:


 I think we don't have to worry about backwards compatibility much yet.
 There has not been a formal release of the component, so if there are people
 using the software, they will accept that they are bleeding edgers.
 Indeed addNode should return the KEY_VALUE relationship and I think we
 should change the signature of SortedTree to turn it into
 IterableRelationship. No need to maintain a Node iterator, the node is
 always one getEndNode away.
 Niels

  Date: Thu, 8 Sep 2011 14:17:59 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Issues with IndexedRelationship
 
  Will have to experiment with changing my id's to be stored as longs, it
 does
  make perfect sense really that it would be better.  Thanks for the hint.
 
  In regards to SortedTree returning the KEY_VALUE relationship instead of
 the
  end Node, I had thought of that too, and it would definitely help.
  Could
  end up being a significant change to SortedTree though, e.g.:
sortedTree.addNode( node );
  Would need to return the KEY_VALUE relationship instead of a boolean.
  Which
  not knowing where else SortedTree is used could be a large change?
 
  Maybe SortedTree would have two iterator's available a key_value
  relationship iterator, and a node iterator.  Having a quick look at it
 now
  it seems that it could work ok that way without introducing much code
  duplication.
 
  On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen
  pd_aficion...@hotmail.comwrote:
 
  
   Two longs is certainly cheaper than a string. Two longs take 128 bit
 and
   are stored in the main record of the PropertyContainer, while a String
 would
   require a 64 bit pointer in the main record of the
 PropertyContainer, and
   an additional read in the String store where the string representation
 will
   take up 256 bits. So both memory-wise, as perfomance wise, it is
 better to
   store a UUID as two long values.
  
  
   The main issue is something that needs a deeper fix than adding ID's.
   SortedTree now returns Nodes when traversing the tree. We should
 however
   return the KEY_VALUE Relationship to the indexed Node. Then
   IndexedRelationship.DirectRelationship can be created with that
 relationship
   as an argument. We get the Direction and the RelationshipType for
 free.
   Niels
  
Date: Thu, 8 Sep 2011 11:36:11 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Issues with IndexedRelationship
   
Hi Niels,
   
Sorry I didn't quite write the bit about (1) clearly enough.  The
 problem
   is
that it presently throws an Exception where it shouldn't.
   
This stems from IndexedRelationship.DirectRelationship:
this.endRelationship = endNode.getSingleRelationship(
SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
   
So if the end node has more than one incoming KEY_VALUE relationship
 a
   more
than one relationship exception is thrown.
   
Instead of the getSingleRelationship I was planning on iterating
 over the
relationships and matching the UUID stored at the root end of the IR
 with
one of the KEY_VALUE relationships (which is why using a unique id
 is
necessary rather than the relationship type).  Note: there will
 actually
still be an issue if the same IR has multiple relationships to the
 same

Re: [Neo4j] IndexedRelationship some observations and questions

2011-09-05 Thread Bryce
Hi Niels,

Thanks for your feedback below.

Specifically on the anonymous inner class issue in general they can be used
as a Comparator without a problem, and in this case it worked perfectly for
creating the IndexedRelationship.  The issue was when then trying to
instantiate the anonymous inner class in IndexedRelationshipExpander which
was throwing a ClassNotFoundException (I think, from memory).

I also think non-static inner classes also wouldn't work as it is coded
right now, since they need a reference to the outer class, and personally
supporting them may be rather difficult.

Even if anonymous inner classes could be supported, is it really a good idea
to store as a comparator XYZClass$5 when adding another anonymous inner
class to the XYZClass would increment the number and mean the index will no
longer work.

Given what you have said about how the tree is represented it should be
relatively easy to have IndexedRelationshipExpander support following
relationships from the leaf nodes through to the root (as well as what it
presently does: root - leaf  direct relationships).  I might have a look
at doing that later on today.

Cheers
Bryce

On Tue, Sep 6, 2011 at 6:58 AM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:


 Hi Bryce,
 Sorry for my belated response. I have been away for a couple of days and
 wasn't able to check my emails.
 I am glad you took the time to look into the IndexRelationship module. It
 certainly could use some scrutiny.
 Remarks:
 1)  Good catch... Something the unit test didn't catch because it runs in
 the same namespace as IndexedRelationship itself. Didn't catch it in user
 code either, because personally I prefer to directly call SortedTree.
 2) Agreed. It should be possible to define more than one IndexRelationship
 per node.
 3) I haven't tried out an anonymous inner class as Comparator. As far as I
 can tell any object implementing ComparatorNode should be able to work as
 a comparator.

 Questions:
 1) That is certainly an option. IndexRelationships however offer you the
 possibility to sort your Relationships based on some value associated with a
 node (for example creation/edit date of the document). This may be a reason
 to use IndexRelationships even in the situation where you have less than 500
 entries per tag (though it would be possible to do that sorting in memory
 too).
 2) The end node of an IndexRelationship is always referred to by a
 Relationship with RelationshipType KEY_VALUE, and has a property
 tree_name (both are defined in SortedTree). The tree_name property has
 the same value as the RelationshipType.name used in IndexRelationship. To
 traverse from a leaf node to the tree root, keep following the incoming
 relationships: KEY_VALUE (there is only one), KEY_ENTRY (there can be many),
 SUB_TREE (there can be many), TREE_ROOT (there is only one)
 Niels


  Date: Fri, 2 Sep 2011 11:44:40 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: [Neo4j] IndexedRelationship some observations and questions
 
  Hi,
 
  I have been looking at performance options for Neo4j as presently I have
  been observing a number of performance issues.  I am still investigating
 the
  way to get the best performance out of what I am doing, and one thing it
  might be are longer running transactions stopping other work going on
 (but
  thats an aside to what this message is about).
 
  One of the things that I investigated using was the IndexedRelationship
 work
  by Niels.  Thought I would give a bit of feedback, although I haven't
 quite
  got this implemented at present.
 
  1) I had to change the IndexedRelationshipExpander to be a public class
 in
  order to use it outside the package its in.
 
  2) IndexedRelationship assumes only one tree root per node, whereas the
  expander allows for multiple (IndexedRelationship uses
 getSingleRelationship
  vs expander using getRelationships then matching on tree name).  Having
  multiple would obviously be good as it means you could have two types of
  relationships covered by IndexedRelationship's.
 
  3) Might pay to make it clear in the Javadocs for IndexedRelationship
 that
  the comparator can't be an anonymous inner class.
 
  Then I have some questions about usage of this.  First a little
 background
  of the model I have, from reading a few things it seems quite standard.
   There are a lot of document nodes each of which have a relationship
 with
  multiple tag nodes.  Documents generally have in the order of 10-20
 tags,
  and tags can have as few as 1 document and sometimes tens of thousands.
   When tags are viewed through the UI they are almost always displayed
 with a
  descending date ordered list of documents.  Seemed to be to fit quite
 well
  with IndexedRelationship.
 
  1) I was thinking of having a switch over point at say around 500
 documents
  for a given node where I will switch from using normal relationships to
 an
  IndexedRelationship as I was thinking at small numbers

[Neo4j] IndexedRelationship some observations and questions

2011-09-01 Thread Bryce
Hi,

I have been looking at performance options for Neo4j as presently I have
been observing a number of performance issues.  I am still investigating the
way to get the best performance out of what I am doing, and one thing it
might be are longer running transactions stopping other work going on (but
thats an aside to what this message is about).

One of the things that I investigated using was the IndexedRelationship work
by Niels.  Thought I would give a bit of feedback, although I haven't quite
got this implemented at present.

1) I had to change the IndexedRelationshipExpander to be a public class in
order to use it outside the package its in.

2) IndexedRelationship assumes only one tree root per node, whereas the
expander allows for multiple (IndexedRelationship uses getSingleRelationship
vs expander using getRelationships then matching on tree name).  Having
multiple would obviously be good as it means you could have two types of
relationships covered by IndexedRelationship's.

3) Might pay to make it clear in the Javadocs for IndexedRelationship that
the comparator can't be an anonymous inner class.

Then I have some questions about usage of this.  First a little background
of the model I have, from reading a few things it seems quite standard.
 There are a lot of document nodes each of which have a relationship with
multiple tag nodes.  Documents generally have in the order of 10-20 tags,
and tags can have as few as 1 document and sometimes tens of thousands.
 When tags are viewed through the UI they are almost always displayed with a
descending date ordered list of documents.  Seemed to be to fit quite well
with IndexedRelationship.

1) I was thinking of having a switch over point at say around 500 documents
for a given node where I will switch from using normal relationships to an
IndexedRelationship as I was thinking at small numbers of relationships
normal relationships would be quicker.  Would that be correct, or not worth
it?

2) On the tag end (which is the incoming end of the document-tag
relationship) I was going to use a IndexedRelationshipExpander which would
cover the case of whether the relationship was done through normal
relationships, or through an IndexedRelationship.  I also need to get a set
of tags from the document end where their may be both normal relationships,
and relationships coming from multiple IndexedRelationship's.  From looking
at it IndexedRelationshipExpander doesn't cover the reverse direction, but I
would imagine using a relationship expander here would be correct.  What
would the best way of doing this be?

As an aside it may be a good idea to note in the configuration settings
page:
http://wiki.neo4j.org/content/Configuration_Settings#Optimizing_for_traversals_example
that -XX:+UseNUMA
only works when using the Parallel Scavenger garbage collector (default
or -XX:+UseParallelGC) not the concurrent mark and sweep one.  Based on

Cheers
Bryce
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Invitation to connect on LinkedIn

2011-08-05 Thread bryce hendrix via LinkedIn
LinkedIn





bryce hendrix requested to add you as a connection on LinkedIn:
  
--

Craig,

I'd like to add you to my professional network on LinkedIn.

- bryce

Accept invitation from bryce hendrix
http://www.linkedin.com/e/5gyj7a-gqzdnnl0-p/h9LPQ_TdyUOQHKzIpND15vYO56OQOUsn/blk/I181440225_9/pmpxnSRJrSdvj4R5fnhv9ClRsDgZp6lQs6lzoQ5AomZIpn8_elYRcz8Md3gNe359bQR9pQ9Tt6NBbPoRdPcPdz0TdPcLrCBxbOYWrSlI/EML_comm_afe/

View invitation from bryce hendrix
http://www.linkedin.com/e/5gyj7a-gqzdnnl0-p/h9LPQ_TdyUOQHKzIpND15vYO56OQOUsn/blk/I181440225_9/0VnPkOcz0Qd34UckALqnpPbOYWrSlI/svi/
 
--

DID YOU KNOW that LinkedIn can find the answers to your most difficult 
questions? Post those vexing questions on LinkedIn Answers to tap into the 
knowledge of the world's foremost business experts: 
http://www.linkedin.com/e/5gyj7a-gqzdnnl0-p/ask/inv-23/
 
-- 
(c) 2011, LinkedIn Corporation
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Invitation to connect on LinkedIn

2011-08-05 Thread bryce hendrix via LinkedIn
LinkedIn





bryce hendrix requested to add you as a connection on LinkedIn:
  
--

Craig,

I'd like to add you to my professional network on LinkedIn.

- bryce

Accept invitation from bryce hendrix
http://www.linkedin.com/e/5gyj7a-gr021ccv-5u/h9LPQ_TdyUOQHKzIpND15vYO56OQOUsn/blk/I181685058_9/pmpxnSRJrSdvj4R5fnhv9ClRsDgZp6lQs6lzoQ5AomZIpn8_elYUdj0Re3oNe359bQR9pQ9Tt6NBbPoRdPcPdz0TdPcLrCBxbOYWrSlI/EML_comm_afe/

View invitation from bryce hendrix
http://www.linkedin.com/e/5gyj7a-gr021ccv-5u/h9LPQ_TdyUOQHKzIpND15vYO56OQOUsn/blk/I181685058_9/0VnPwRc3kUdz4UckALqnpPbOYWrSlI/svi/
 
--

DID YOU KNOW that LinkedIn can find the answers to your most difficult 
questions? Post those vexing questions on LinkedIn Answers to tap into the 
knowledge of the world's foremost business experts: 
http://www.linkedin.com/e/5gyj7a-gr021ccv-5u/ask/inv-23/
 
-- 
(c) 2011, LinkedIn Corporation
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] path finding using OSM ways

2011-05-30 Thread bryce hendrix
I am finally getting back to experimenting with Neo4j. Because it has been a
while since I last looked at it, I've forgotten just about everything. I
want to start with something simple, is there any sample code which does A*
path finding over OSM ways?

Thanks,
Bryce
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Getting started with Neo4J Spatial

2011-02-18 Thread bryce hendrix
Nolan,

The first experience with Neo4j Spatial was with the texas.osm file. I
imported it on my notebook  I think it took 15 hours, if I remember
correctly. I quickly decided to play around with just Austin for the time
being. If you'd like, i can zip up my Austin file (just Austin, not any of
the suburbs) and send it to you. Austin takes about 30 secs to import and
index.

Bryce

On Fri, Feb 18, 2011 at 2:41 PM, Nolan Darilek no...@thewordnerd.infowrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 02/18/2011 11:23 AM, Craig Taverner wrote:
 
  I will answer inline below.
 

 Thanks, I'll do the same.

 
  Neo4j Spatial is a good choice for OSM data, and can load OSM files and
  expose them to GIS application for analysis. Are you planning to run this
 on
  a mobile device? The question of whether or not, or how well, Neo4j runs
 on
  embedded devices would need to be answered by others. I only use it (so
 far)
  on PCs. I know there was an android prototype a few years back, but I do
 not
  believe android (or other platforms) are officially supported. If you
 have a
  backend server for neo4j (which it sounds like you do), then there is no
  problem.
 

 Right, currently my project is client-server, with an Android app or web
 browser communicating with a web-based server. I've made a rough cut at
 a database abstraction so I can move away from LibOSM to Neo4J and,
 presumably, to other database types in the future should I find
 something that works well on mobile devices. At the moment, that's
 almost certainly not Neo4J Spatial. It's been chugging away importing
 Texas on my modest desktop for several hours, and the file is nearly at
 30 gigs. Considering that this is much larger than either H2 or
 MongoDB's storage of the same data, I'm hoping there are plans to
 optimize this in the future. :) I saw something about small strings here
 recently, and how this might decrease storage requirements by 80%. Is
 that just for strings, or across the entire database? For reference, the
 uncompressed OSM data is only 5G so a 6X increase worries me.

 Anyhow, information on what I'm doing can be found at my project's
 development site, http://hermesgps.info. There's also an instance which
 may or may not be running. A live demo of an actual map can sometimes be
 seen at http://thewordnerd.homeip.net/#30.267153,-97.74306 though
 performance is pretty bad right now since my desktop is taxed with the
 Neo4J import. I'll refer back to that later on.

  what you really want. I think I would need to know more about your needs
  before recommending further.
 

 So at the above link, you see a list of nearby points, along with a
 direction and distance. The same information is also exported via web
 services to mobile apps in a scrollable list. What I'd hoped to do was
 to create an infinitely-scrolling list as commonly found in other
 Android apps. Unfortunately, it's hard to do that by simply bumping out
 the distance, since you can't necessarily know whether adding X to the
 bounding box dimensions might give you no points or another thousand. :)
 Since MongoDB can do this with a rather limited geospatial
 implementation, I was hoping that Neo4J or something tuned for spatial
 queries would *really* rock at it.

 
  Certainly Neo4j can import OSM data into a decent graph. The OSM model in
  Neo4j is somewhat similar to the graph model in the OSM xml files, but
  expressed in a more graph-friendly way. The only public information on
 the
  model so far is in the presentation I gave at FOSS4J. See these at
  slideshare:
 
 http://www.slideshare.net/craigtaverner/neo4j-spatial-backing-a-gis-with-a-true-graph-database
  (slides
  13, 14 and 15 show graphs of the OSM model in the database).
 

 I'd appreciate a description of those slides if you or someone else has
 the time, as I can't see them. :) In any case, I'll try playing with the
 database myself once this import finishes and see what I can learn that
 way.

 
  If you want to do this in Java, take a look at the sample code at
 
 https://github.com/neo4j/neo4j-spatial/blob/master/src/test/java/org/neo4j/gis/spatial/TestDynamicLayers.java
 

 I'm doing this in Scala, so thanks for pointing me to the Java example.
 I've been seeing lots of old testcases and such so it was tough to find
 ones that were still valid.

 So I wrote a simple Scala script that is currently importing OSM data,
 and this raises a few questions. The README shows a shapefile being
 imported without a BatchInserter. Is there any way to do this with OSM
 as well? One feature I'd like to provide at some point are automatic map
 updates, so each week an OSM changeset would be fetched and merged in.
 It would be great if I could do that merge without having to shut down
 the live database, and to handle that changeset merge as a single
 transaction. I gather changeset imports aren't yet possible, but is
 there any way to forgo the BatchInserter? Even if the process is slower

Re: [Neo4j] Searching for ways in imported OSM data

2011-01-18 Thread bryce hendrix
Peter,

I think my problem is more of understanding how the graph is built from OSM
data. The method of find the closest edge gives me the way geometry, but
(please correct me if I am wrong) I need to find the closest OSM node (Node
with lat and lon properties) in the geometry.

Thanks,
Bryce

On Tue, Jan 18, 2011 at 2:05 AM, Peter Neubauer 
peter.neuba...@neotechnology.com wrote:

 Bryce,
 are you looking for some kind of snapping to the nearest geometry? I
 was doing something like that, Craig did a test on that, see

 https://github.com/neo4j/neo4j-spatial/blob/master/src/test/java/org/neo4j/gis/spatial/TestSpatialUtils.java#L53
 ,
 there is some basic support for that in

 SpatialTopologyUtils.java.

 Does that help to get started? Feel free to add more utilities and
 maybe a routing example, when you get it working.

 Cheers,

 /peter neubauer

 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer

 http://www.neo4j.org   - Your high performance graph database.
 http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



 On Tue, Jan 18, 2011 at 5:50 AM, bryce hendrix brycehend...@gmail.com
 wrote:
  Craig,
 
  I've ran into a bit of a stumbling block. I am attempting to do find a
  simple route using A* from 2 nodes on 2 different ways. What is the best
 way
  to find the points closest to 2 reference points which are on ways?
 Assuming
  I've got those nodes, is there anything special I have to do, other than
  setting up the estimate and cost evaluators?
 
  If I can figure this out, I'll submit my example back to you guys. I'm
  excited about this stuff, but I seem to be discovering the limits of the
  docs.
 
  Bryce
 
  On Sun, Jan 16, 2011 at 7:11 PM, Craig Taverner cr...@amanzi.com
 wrote:
 
  Great that they all work :-)
 
  Good luck with the rest of the project and keep us posted, we're
 interested
  in any feedback on the API.
 
  (getting lat/long mixed up is one of those things we all keep doing, I'm
  pretty sure I did it once as recently as last month ... ;-)
 
  On Sun, Jan 16, 2011 at 11:52 PM, bryce hendrix brycehend...@gmail.com
  wrote:
 
   Craig, Peter,
  
   Its useful if I get the latitude and longitude in the correct order
 for
  the
   Point. Ugh. I've found
   that SearchPointsWithinOrthodromicDistance, SearchClosest,
   and SpatialTopologyUtils.findClosestEdges all work for me. Looks like
 my
   project is well on its way now, thanks for the help.
  
   Bryce
  
   On Sun, Jan 16, 2011 at 3:16 PM, Craig Taverner cr...@amanzi.com
  wrote:
  
The SearchPointsWithinOrthodromicDistance basically does a search on
 a
rectangular bounding box, and then inside the result set filters by
distance
from the center. The filter probably works only on points as implied
 by
   the
class name.
   
The SpatialTopologyUtils class has a method findClosestEdge, which
 will
   do
what you are looking for. If you call it without a distance value,
 it
   will
take 1% of the total span of your layer as the search window, so if
  this
does not make sense for your data (eg. your layer covers a small
 area,
  as
you hinted at), then pass in the distance in units of the coordinate
   system
of the layer (probably WGS84, degrees, if you are using only OSM
 data).
   Try
it out and let us know.
   
See:
   
  -
   
   
  
 
 https://github.com/neo4j/neo4j-spatial/blob/master/src/main/java/org/neo4j/gis/spatial/SpatialTopologyUtils.java
  -
   
   
  
 
 https://github.com/neo4j/neo4j-spatial/blob/master/src/test/java/org/neo4j/gis/spatial/TestSpatialUtils.java
   
   
On Sun, Jan 16, 2011 at 11:19 AM, Peter Neubauer 
peter.neuba...@neotechnology.com wrote:
   
 Bryce,
 I think (Craig, correct me if I'm wrong) you need to have a Point
 layer to be able to do that search. The default OSM layer is
 containing a lot of geometries, so I think you first should define
 a
 layer on top of the full imported one, then search. I did
 something
 like that in another spike, see


   
  
 
 https://github.com/popdevelop/snapplr/blob/master/server_java/src/main/java/com/geosnappr/TaginfoImporter.java#L312

 The layer is defined with something like


   
  
 
 https://github.com/neo4j/neo4j-spatial/blob/master/src/test/java/org/neo4j/gis/spatial/TestDynamicLayers.java#L26
 on top of the imported full data layer.

 Does that help?

 Cheers,

 /peter neubauer

 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer

 http://www.neo4j.org   - Your high performance graph
database.
 http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
   party

Re: [Neo4j] Searching for ways in imported OSM data

2011-01-18 Thread bryce hendrix
Thanks Craig! I'll not only play around with it tonight, but you nudged me
towards finally installing uDig.

Bryce

On Tue, Jan 18, 2011 at 6:45 PM, Craig Taverner cr...@amanzi.com wrote:

 Hi Bryce,

 While I had this on my mind, I decided to look into the code and I went
 ahead developed the first approach. So now we index osm nodes in the r-tree
 also (in addition to ways). By default, only nodes that have tags are added
 to the index, because these are seen as 'points of interest', while all
 other nodes are assumed to only be interesting as part of a way. However, I
 added a new method to the OSMImporter that allows you to turn on indexing
 of
 all nodes, regardless of whether they are points of interest or not.

 Just replace the old call to osmImporter.importFile(batchInserter,filePath)
 with instead osmImporter.importFile(batchInserter,filePath,true). The last
 parameter turns on indexing of all nodes. I've attached an image from the
 AWE product (based on uDig) that shows a large number of dynamic layers
 exported from the OSM model of Malmö, and you can see all nodes are
 visible.
 This is the output from the TestDynamicLayers unit test, run with the
 boolean set to true.

 If you are curious, the Neo Technology office is actually included in this
 image :-)

 I have just pushed the code to github, so you need to get the latest from
 there.

 Regards, Craig

 On Tue, Jan 18, 2011 at 9:27 PM, Craig Taverner cr...@amanzi.com wrote:

  Hi Bryce,
 
  Currently the OSM model in Neo4j Spatial only indexes the ways, which
 means
  that when you do spatial searches on the layers, you are always talking
  about the  ways. Whether this suites you or not depends on your specific
 use
  case. You could search for the closest ways, and the use the
  SpatialTopologyUtils to snap to the point node on the way that is closest
 to
  your point of interest, which may give you the result you want.
 
  If you are specifically interested in only searching on nodes, not ways,
  then there are two options, both requiring some refactoring of the OSM
 model
  (ie. you need to make changes to Neo4j Spatial).
 
 - In the OSMImporter code, simply add all nodes as Point geometries to
 the index. This is simple to code and will get what you want, but
 might not
 lead to the most efficient index, since both the ways and the nodes
 they are
 composed of get indexed in the same index.
 - Enhance the OSM model to support multiple indexes, for nodes, ways
 and relationships. This requires much more code, but opens the
 possibility
 to enhance the performance by allowing the index to be optimized for
 the
 geometries and densities seen in the dataset.
 
  The reasons we have not done this yet are:
 
 - Existing use cases have not required it (but we are reaching cases
 that do now)
 - It is not clear yet which of the above options is the best overall
 (one index or several). Originally I believed in the multiple-index
 approach, mostly because points can be indexed with very different
 indexes
 than geometries (eg. we can even place lucene behind the point index,
 which
 we cannot do for geometries). But at the moment I am leaning back
 towards
 the single index approach (simpler code).
 
  So, I hope the approach of searching for close ways, and then using the
  topology utils to snap to the closest point is the right solution for
 you.
  Otherwise considering contributing to the updates necessary to complete
 the
  point index :-)
 
  Regards, Craig
 
  On Tue, Jan 18, 2011 at 6:08 PM, bryce hendrix brycehend...@gmail.com
 wrote:
 
  Peter,
 
  I think my problem is more of understanding how the graph is built from
  OSM
  data. The method of find the closest edge gives me the way geometry, but
  (please correct me if I am wrong) I need to find the closest OSM node
  (Node
  with lat and lon properties) in the geometry.
 
  Thanks,
  Bryce
 
  On Tue, Jan 18, 2011 at 2:05 AM, Peter Neubauer 
  peter.neuba...@neotechnology.com wrote:
 
   Bryce,
   are you looking for some kind of snapping to the nearest geometry? I
   was doing something like that, Craig did a test on that, see
  
  
 
 https://github.com/neo4j/neo4j-spatial/blob/master/src/test/java/org/neo4j/gis/spatial/TestSpatialUtils.java#L53
   ,
   there is some basic support for that in
  
   SpatialTopologyUtils.java.
  
   Does that help to get started? Feel free to add more utilities and
   maybe a routing example, when you get it working.
  
   Cheers,
  
   /peter neubauer
  
   GTalk:  neubauer.peter
   Skype   peter.neubauer
   Phone   +46 704 106975
   LinkedIn   http://www.linkedin.com/in/neubauer
   Twitter  http://twitter.com/peterneubauer
  
   http://www.neo4j.org   - Your high performance graph
  database.
   http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
 party.
  
  
  
   On Tue, Jan 18, 2011 at 5:50 AM, bryce hendrix 
 brycehend...@gmail.com

Re: [Neo4j] Searching for ways in imported OSM data

2011-01-17 Thread bryce hendrix
Craig,

I've ran into a bit of a stumbling block. I am attempting to do find a
simple route using A* from 2 nodes on 2 different ways. What is the best way
to find the points closest to 2 reference points which are on ways? Assuming
I've got those nodes, is there anything special I have to do, other than
setting up the estimate and cost evaluators?

If I can figure this out, I'll submit my example back to you guys. I'm
excited about this stuff, but I seem to be discovering the limits of the
docs.

Bryce

On Sun, Jan 16, 2011 at 7:11 PM, Craig Taverner cr...@amanzi.com wrote:

 Great that they all work :-)

 Good luck with the rest of the project and keep us posted, we're interested
 in any feedback on the API.

 (getting lat/long mixed up is one of those things we all keep doing, I'm
 pretty sure I did it once as recently as last month ... ;-)

 On Sun, Jan 16, 2011 at 11:52 PM, bryce hendrix brycehend...@gmail.com
 wrote:

  Craig, Peter,
 
  Its useful if I get the latitude and longitude in the correct order for
 the
  Point. Ugh. I've found
  that SearchPointsWithinOrthodromicDistance, SearchClosest,
  and SpatialTopologyUtils.findClosestEdges all work for me. Looks like my
  project is well on its way now, thanks for the help.
 
  Bryce
 
  On Sun, Jan 16, 2011 at 3:16 PM, Craig Taverner cr...@amanzi.com
 wrote:
 
   The SearchPointsWithinOrthodromicDistance basically does a search on a
   rectangular bounding box, and then inside the result set filters by
   distance
   from the center. The filter probably works only on points as implied by
  the
   class name.
  
   The SpatialTopologyUtils class has a method findClosestEdge, which will
  do
   what you are looking for. If you call it without a distance value, it
  will
   take 1% of the total span of your layer as the search window, so if
 this
   does not make sense for your data (eg. your layer covers a small area,
 as
   you hinted at), then pass in the distance in units of the coordinate
  system
   of the layer (probably WGS84, degrees, if you are using only OSM data).
  Try
   it out and let us know.
  
   See:
  
 -
  
  
 
 https://github.com/neo4j/neo4j-spatial/blob/master/src/main/java/org/neo4j/gis/spatial/SpatialTopologyUtils.java
 -
  
  
 
 https://github.com/neo4j/neo4j-spatial/blob/master/src/test/java/org/neo4j/gis/spatial/TestSpatialUtils.java
  
  
   On Sun, Jan 16, 2011 at 11:19 AM, Peter Neubauer 
   peter.neuba...@neotechnology.com wrote:
  
Bryce,
I think (Craig, correct me if I'm wrong) you need to have a Point
layer to be able to do that search. The default OSM layer is
containing a lot of geometries, so I think you first should define a
layer on top of the full imported one, then search. I did something
like that in another spike, see
   
   
  
 
 https://github.com/popdevelop/snapplr/blob/master/server_java/src/main/java/com/geosnappr/TaginfoImporter.java#L312
   
The layer is defined with something like
   
   
  
 
 https://github.com/neo4j/neo4j-spatial/blob/master/src/test/java/org/neo4j/gis/spatial/TestDynamicLayers.java#L26
on top of the imported full data layer.
   
Does that help?
   
Cheers,
   
/peter neubauer
   
GTalk:  neubauer.peter
Skype   peter.neubauer
Phone   +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter  http://twitter.com/peterneubauer
   
http://www.neo4j.org   - Your high performance graph
   database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
  party.
   
   
   
On Sat, Jan 15, 2011 at 10:58 PM, bryce hendrix 
  brycehend...@gmail.com
wrote:
 I'm pretty new to neo4j, so please excuse me if this is a FAQ.

 I exported OSM data for a city from the OSM site, then imported it
   using
the
 OSMImporter. I can see the layer via the webserver, so I know if
 got
 imported okay. Now I would like to find the way nearest to a
  coordinate
via
 the Java API, but I'm not really sure how to do that. I've tried
  using
 SearchPointsWithinOrthodromicDistance, but the results of the query
  are
 always empty. Can someone give me some tips, or provide a simple
   example?

 Thanks in advance,
 Bryce
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
   
   ___
   Neo4j mailing list
   User@lists.neo4j.org
   https://lists.neo4j.org/mailman/listinfo/user
  
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https

[Neo4j] Searching for ways in imported OSM data

2011-01-15 Thread bryce hendrix
I'm pretty new to neo4j, so please excuse me if this is a FAQ.

I exported OSM data for a city from the OSM site, then imported it using the
OSMImporter. I can see the layer via the webserver, so I know if got
imported okay. Now I would like to find the way nearest to a coordinate via
the Java API, but I'm not really sure how to do that. I've tried using
SearchPointsWithinOrthodromicDistance, but the results of the query are
always empty. Can someone give me some tips, or provide a simple example?

Thanks in advance,
Bryce
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user