Re: [Neo4j] Benchmarking Neo4j with Rtree index -v- PostgreSQL/PostGIS (Peter Neubauer)

2011-02-01 Thread Dave Hesketh
Peter
As you suggested, I reran the searches without closing the db. On the 2nd
iteration, the search times dropped by 66% and leveled off after that.
Unfortunately I don't understand your suggestion: 'You can even warm up
the OS file caches by doing dd nodestore.id >> /dev/null for all store files
a couple of times'. I can't find the command or executable dd. I'm running
on Windows 7 and what you suggest looks to me like a Unix or Linux command.
Rgds Dave
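
dd is a Unix utility and is not part of a standard Windows installation. The same effect (reading every store file once or twice so the OS keeps it in its file cache) can be approximated portably from Java. The following is only a minimal sketch under stated assumptions: the class name StoreFileWarmer and the directory argument are hypothetical, and it should probably be run before the database is opened, since store files held open by Neo4j may be locked on Windows.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: reads files and discards the bytes so the OS caches them. */
final class StoreFileWarmer {

    /** Reads each regular file directly under storeDir 'passes' times; returns total bytes read. */
    static long warm(Path storeDir, int passes) throws IOException {
        byte[] buffer = new byte[1 << 20]; // 1 MiB read buffer
        long total = 0;
        for (int pass = 0; pass < passes; pass++) {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(storeDir)) {
                for (Path file : files) {
                    if (!Files.isRegularFile(file)) {
                        continue; // skip sub-directories such as index folders
                    }
                    try (InputStream in = Files.newInputStream(file)) {
                        int n;
                        while ((n = in.read(buffer)) != -1) {
                            total += n; // the data itself is thrown away
                        }
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the database directory (assumption: all store files sit directly in it)
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("Read " + warm(dir, 2) + " bytes");
    }
}
```

Whether this helps depends on how much free memory the OS has for its file cache relative to the size of the store files.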
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Benchmarking Neo4j with Rtree index -v- PostgreSQL/PostGIS (Peter Neubauer)

2011-02-01 Thread Dave Hesketh
Peter,
Sorry I am slow replying. I'm on an extended visit to South Africa and my
internet access is a very unreliable 2G (yes 2G, occasional 3G) mobile
access.
With my previous postings to the forum, I have tried to attach all the
information (including code) on the test setup and results (PDFs, text
files, Java files) but your system tries to embed them in the body of the
text. Can you suggest somewhere I can post them or an email address I can send
them to?
In summary, I am using amended versions of Davide's routines
(ShapeFileImporter to build the database and index and SearchWithin for the
searching) downloaded from GitHub. The first routine uses the RTreeIndex you
suggested.
I will try your suggestions for running the tests warm and report back the
results.
Thanks for your help
Dave
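
Running the tests "warm" usually means repeating the same searches against an already open database and leaving the first iterations out of the reported figures. A minimal sketch of such a timing loop follows; the class WarmBenchmark, its method names and the placeholder workload are all hypothetical, and the Runnable merely stands in for whatever executes the real searches.

```java
import java.util.Arrays;

/** Hypothetical timing helper that discards warm-up iterations. */
final class WarmBenchmark {

    /** Runs 'search' warmUp times unmeasured, then 'measured' times; returns each duration in ns. */
    static long[] time(Runnable search, int warmUp, int measured) {
        for (int i = 0; i < warmUp; i++) {
            search.run(); // fills caches and lets the JIT compile hot code
        }
        long[] nanos = new long[measured];
        for (int i = 0; i < measured; i++) {
            long start = System.nanoTime();
            search.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    /** Returns the middle element of the sorted values (the upper one for even lengths). */
    static long median(long[] values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload; a real test would run the search queries here.
        Runnable search = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable; keeps the loop from being optimised away");
            }
        };
        long[] durations = time(search, 3, 10);
        System.out.println("median ms: " + median(durations) / 1_000_000.0);
    }
}
```

Reporting a median rather than a mean keeps a single slow iteration (for example a garbage collection pause) from dominating the result.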


Re: [Neo4j] User Digest, Vol 45, Issue 39

2011-01-12 Thread Dave Hesketh
Davide
Sorry for slow response - been on extended leave.
I have already amended ShapeFileImporter to include maxNodeReferences (and
minNodeReferences - aka M,m) and found they made precious little difference
to the performance. As a result, I assumed that I needed to ensure I had the
optimal basic Neo4j settings and repeat my tests for different values of M
and m. I am currently writing up my results so far and will post them in the
next day or so.
Thanks Dave

On 20 December 2010 18:32,  wrote:

> Today's Topics:
>
>   1. Re:  R-Tree indexing performance (Davide)
>   2.  New Index API replacing the old one in REST (Peter Neubauer)
>   3. Re:  New Index API replacing the old one in REST
>  (Javier de la Rosa)
>   4. Re:  Transaction and REST API (Jim Webber)
>   5. Re:  New Index API replacing the old one in REST (Peter Neubauer)
>   6. Re:  Transaction and REST API (Ido Ran)
>   7.  (no subject) (Francois Kassis)
>   8. Re:  A TinkerPop Stack Release (Peter Neubauer)
>   9. Re:  A TinkerPop Stack Release (Marko Rodriguez)
>  10.  RRD dependency not having a valid POM (Peter Neubauer)
>  11. Re:  A TinkerPop Stack Release (Peter Neubauer)
>
>
> --
>
> Message: 1
> Date: Mon, 20 Dec 2010 12:13:39 +0100
> From: Davide 
> Subject: Re: [Neo4j] R-Tree indexing performance
> To: Neo4j user discussions 
> Message-ID:
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Sat, Dec 18, 2010 at 00:48, Craig Taverner  wrote:
> >   - The RTree implementation itself - I know RTree's are not all equal, so
> >   there may be room for general RTree improvements and optimizations. As
> >   mentioned we have not put much time into optimizing the RTree very much, so
> >   hopefully there is room to move here.
>
> Hi all,
>
> yes, the RTree algorithm can surely be improved!
>
> But first of all, I'd try to find the optimal value for the RTree
> maxNodeReferences parameter.
> This parameter should affect performance.
>
> You can pass this parameter to the RTree constructor:
>
> public RTreeIndex(GraphDatabaseService database, Layer layer,
>         int maxNodeReferences, int minNodeReferences) {
>
> but after an RTree is created, you can't change it. The current default
> value is 100.
>
> Cheers,
> --
> Davide Savazzi
>
>
> --
>
> Message: 2
> Date: Mon, 20 Dec 2010 12:18:44 +0100
> From: Peter Neubauer 
> Subject: [Neo4j] New Index API replacing the old one in REST
> To: Neo4j user discussions ,  neo4jrb
>
> Message-ID:
>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Graphytes,
> the exposure of the new Index on relationships and nodes in the Neo4j
> REST API will be part of M06, so anyone using the Index in their
> bindings will need to switch to that API. I am thinking of the PHP,
> .NET, Ruby and Perl bindings (Max, @onewheelgood, Javier and all
> others).
>
> Is there anyone heavily relying on the current REST Index API in
> production and needs it? If so, we will need to add support for it
> after the release as a server plugin. Since all this is happening
> still in Milestones, I think we are good just removing the existing
> one if nobody speaks up.
>
> Cheers,
>
> /peter neubauer
>
> GTalk:      neubauer.peter
> Skype       peter.neubauer
> Phone       +46 704 106975
> LinkedIn    http://www.linkedin.com/in/neubauer
> Twitter     http://twitter.com/peterneubauer
>
> http://www.neo4j.org       - Your high performance graph database.
> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
>
>
> --
>
> Message: 3
> Date: Mon, 20 Dec 2010 12:21:20 +0100
> From: Javier de la Rosa 
> Subject: Re: [Neo4j] New Index API replacing the old one in REST
> To: Neo4j user discussions 
> Cc: neo4jrb 
> Message-ID:
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Mon, Dec 20, 2010 at 12:18, Peter Neubauer wrote:
> > Graphytes,
> > the exposure of the new Index on relationships and nodes in the Neo4j
> > REST API will be part of M06, so anyone using the Index in their
> > bindings will need to switch to that API. I am thinking of the PHP,
> > .NET, Ruby and Perl bindings (Max, @onewheelgood, Javier and all
> > others).
>
> Great news!
> Where is the documentation to apply the new changes on the index behaviour?
>
>
>
> --
> Javier de la Rosa
> http://versae.es
>
>
> --
>
> Message: 4
> Date: Mon, 20 Dec 2010 11:28:34 

Re: [Neo4j] User Digest, Vol 45, Issue 35

2011-01-12 Thread Dave Hesketh
>
>
>
> --
>
> Message: 2
> Date: Sat, 18 Dec 2010 00:48:00 +0100
> From: Craig Taverner 
> Subject: Re: [Neo4j] R-Tree indexing performance
> To: Neo4j user discussions 
> Message-ID:
>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi all,
>
> Yes, there are some plans for improvements to the index. However, I should
> start by saying that we have not done extensive benchmarking of the RTree
> against the PostGIS implementation, so the work done by Dave is very
> interesting and I would like to learn more about his test case. One thing
> that would be interesting to find out is whether the performance differences
> are due to the RTree implementation itself, or due to some other underlying
> geometry test code, JTS versus something else, or Java versus C.
>
> Peter points out that we currently have a search algorithm that does not
> perhaps make the best use of the graph, since it does not use traversals,
> but uses recursion and produces a result-set instead of a result stream. It
> is not completely clear all the ways this might affect performance, but it
> seems likely that in two cases we should see performance issues: large
> result sets and deep traversals. Moving the logic to a real traverser, as
> used in some other indexes we have tried, will resolve those issues. But it
> is possible this has nothing to do with Dave's case.
>
> So in summary, I think there are a few areas that can account for these
> differences:
>
>   - General database performance, and I see others have answered with
>   suggestions on dealing with that. Neo4j is generally very fast, but
>   sometimes needs some tuning.
>   - The RTree implementation itself - I know RTree's are not all equal, so
>   there may be room for general RTree improvements and optimizations. As
>   mentioned we have not put much time into optimizing the RTree very much, so
>   hopefully there is room to move here.
>   - The search algorithm's known issues with not leveraging the Neo4j
>   traversal framework which is a very good, and high performance framework.
>
> Peter mentions a new multi-dimensional index I am working on, which I call a
> 'composite index'. I think this will not out-perform the RTree because it is
> targeting a very different data domain, primarily point data with large
> numbers of attributes to be indexed in the same index and queried with
> complex queries. For purely spatial queries, the RTree should perform much
> better. But for combined spatial and statistical queries, the new index
> should perform better. But there are a few tricks we are using to improve
> the performance of the composite index that might be reused for the RTree,
> but they require first porting it to the traversal framework, and then fine
> tuning the traversal performance. So, my preference is to complete the
> composite index, optimize it and then see if some of those optimizations can
> be ported to the RTree at the same time as moving the RTree to the traverser
> framework.
>
> Regards, Craig
>
> On Fri, Dec 17, 2010 at 6:15 PM, Peter Neubauer <
> peter.neuba...@neotechnology.com> wrote:
>
> > Dave,
> > Craig is planning to improve the R-Tree index in several ways:
> >
> > - introduce streaming instead of set based returns from the traversal
> > - work on generic multidimensional indexing.
> >
> > Craig, what do you say?
> >
> > Cheers,
> >
> > /peter neubauer
> >
> > GTalk:  neubauer.peter
> > Skype   peter.neubauer
> > Phone   +46 704 106975
> > LinkedIn   http://www.linkedin.com/in/neubauer
> > Twitter  http://twitter.com/peterneubauer
> >
> > http://www.neo4j.org       - Your high performance graph database.
> > http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
> >
> >
> >
> > On Fri, Dec 17, 2010 at 3:43 PM, Dave Hesketh wrote:
> > > I'm currently comparing the performance of R-Tree indexing in Neo4j with
> > > PostGIS/PostgreSQL. The database and index have been created and searched in
> > > Neo4j using Davide Savazzi's routines: ShapefileImporter and SearchWithin.
> > > The test dataset is 28,000 points (clustered around San Francisco and
> > > Vancouver) and the search is for the points within 10

[Neo4j] R-Tree indexing performance

2010-12-17 Thread Dave Hesketh
I'm currently comparing the performance of R-Tree indexing in Neo4j with
PostGIS/PostgreSQL. The database and index have been created and searched in
Neo4j using Davide Savazzi's routines: ShapefileImporter and SearchWithin.
The test dataset is 28,000 points (clustered around San Francisco and
Vancouver) and the search is for the points within 1000 randomly generated
'circles' (i.e. 16-sided polygons). On average, each search in Neo4j takes 4
times longer than in PostGIS. Now that I know the processing is working
correctly, I want to progressively increase the number of points to 10,000,000.
Can anybody give me advice/tips on improving the performance in Neo4j before
I start scaling up the test? At this stage, I am only interested in the
search performance.
Neo4j Version: 1.2.M05
Environment: Windows 7, i5 64-bit processor, quad core, 4 GB
Thanks Dave
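
For readers who want to reproduce the search shapes, a 16-sided 'circle' can be built by placing vertices at equal angles around a centre point. The sketch below is an illustration only, not the code used in the test: the class CirclePolygon, the bounding box and the radius in main are hypothetical, and coordinates are treated as planar. The same ring, written as WKT, could be given to both systems so that they are queried with identical geometry.

```java
import java.util.Locale;
import java.util.Random;

/** Hypothetical helper: approximates a circle with a regular polygon. */
final class CirclePolygon {

    /** Returns a closed ring: 'sides' vertices on the circle, then the first vertex repeated. */
    static double[][] ring(double cx, double cy, double radius, int sides) {
        double[][] ring = new double[sides + 1][2];
        for (int i = 0; i < sides; i++) {
            double angle = 2 * Math.PI * i / sides; // counter-clockwise from the +x axis
            ring[i][0] = cx + radius * Math.cos(angle);
            ring[i][1] = cy + radius * Math.sin(angle);
        }
        ring[sides][0] = ring[0][0]; // close the ring
        ring[sides][1] = ring[0][1];
        return ring;
    }

    /** Formats the ring as a WKT polygon with six decimal places. */
    static String toWkt(double[][] ring) {
        StringBuilder sb = new StringBuilder("POLYGON ((");
        for (int i = 0; i < ring.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(String.format(Locale.ROOT, "%.6f %.6f", ring[i][0], ring[i][1]));
        }
        return sb.append("))").toString();
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so both systems get the same shapes
        // Hypothetical bounding box and radius, chosen only for illustration
        double minX = -123.5, maxX = -122.0, minY = 37.0, maxY = 38.5, radius = 0.05;
        for (int i = 0; i < 3; i++) {
            double cx = minX + random.nextDouble() * (maxX - minX);
            double cy = minY + random.nextDouble() * (maxY - minY);
            System.out.println(toWkt(ring(cx, cy, radius, 16)));
        }
    }
}
```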


[Neo] -path option on Shell command on Windows XP

2010-03-07 Thread Dave Hesketh
Hi
I can successfully enable the remote server (using ... GraphDatabaseService
graphDb = new EmbeddedGraphDatabase( "D:\\Neo-4j\\MyDatabase" ); ... etc.) and
then start the Shell at the DOS command prompt:

    java -jar neo4j-shell-1.0-rc.jar

But if I follow the documentation and use the -path option, i.e. enable the
remote server (using ... GraphDatabaseService graphDb = new
EmbeddedGraphDatabase( "graphdb" ); ... etc.) and start the Shell at the DOS
command prompt:

    java -jar neo4j-shell-1.0-rc.jar -path d:\Neo-4j\MyDataBase

I get an exception:

    Exception in thread "main" java.lang.NoClassDefFoundError: javax\transaction\TransactionManager

In this case, if I leave out the -path option, the Shell starts OK but I am not
linked to any database.
What am I doing wrong?
Thanks Dave
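
A NoClassDefFoundError means the JVM could not load the named class at run time; javax.transaction.TransactionManager belongs to the JTA API, which is not part of the core Java SE classes. One way to narrow this down is to check whether that class is visible on a given classpath. The sketch below is only a diagnostic aid with a hypothetical class name (ClasspathCheck); it says nothing about which jar the shell actually expects.

```java
/** Hypothetical diagnostic: reports whether a class can be loaded from the current classpath. */
final class ClasspathCheck {

    /** Returns true if the named class can be found, without running its static initialisers. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "javax.transaction.TransactionManager";
        System.out.println(name + " loadable: " + isLoadable(name));
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Note that when a program is started with java -jar, the CLASSPATH environment variable and the -cp option are ignored, so the check only reflects the shell's situation if it is run with the same jars listed explicitly via -cp.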