[Neo4j] Neo4j startup script syntax errors

2011-07-04 Thread Paul Bandler
When invoking the bin/neo4j command on Solaris, the following error messages are 
displayed and the script halts:

TEST:bandlerp@us2187$ ./neo4j: line 37: syntax error in conditional expression: 
unexpected token `('
./neo4j: line 37: syntax error near `^(['
./neo4j: line 37: `  if [[ ${line} =~ ^([^#\s][^=]+)=(.+)$ ]]; then'

Have not investigated - thought dev team might be interested ...

Sent from my iPhone
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Neo4j startup script syntax errors

2011-07-04 Thread Paul Bandler
I'm running the script as shipped, which already has #!/bin/bash on line 1?

Sent from my iPhone

On 4 Jul 2011, at 10:02, Michael Hunger michael.hun...@neotechnology.com 
wrote:

 Which shell is that?
 
 Could you try to change #!/bin/sh to #!/bin/bash for a test?
 
 Thanks a lot
 
 Michael
 


Re: [Neo4j] Neo4j startup script syntax errors

2011-07-04 Thread Paul Bandler
Here are the details of the bash version, which I assume is the standard 
distribution on this Solaris platform.

TEST:bandlerp@us2187$ /bin/bash -version
GNU bash, version 3.00.16(1)-release (sparc-sun-solaris2.10)
Copyright (C) 2004 Free Software Foundation, Inc.
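
A possible workaround, assuming line 37 is extracting key=value pairs from a config file: bash 3.0 cannot parse an unquoted parenthesised regex after =~, and \s is a GNU extension rather than POSIX ERE, so the Solaris regcomp would not honour it anyway. Storing the pattern in a variable works across bash 3.0-4.x:

# Sketch of a bash-3.0-compatible rewrite of line 37; the BASH_REMATCH
# usage below is an assumption about what the script does with the match.
pattern='^([^#[:space:]][^=]+)=(.+)$'
if [[ ${line} =~ $pattern ]]; then
  key="${BASH_REMATCH[1]}"
  value="${BASH_REMATCH[2]}"
fi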

Sent from my iPhone



Re: [Neo4j] Does anyone use Solaris for neo4j app? [was Unable to memory map [was java.io.IOException: Resource temporarily unavailable in MappedPersistenceWindow.java?

2011-07-03 Thread Paul Bandler
I want to re-do these just to be sure, but what I think I've observed is as 
follows...

I've done two tests using ram disks. One was an application fragment that I've 
referred to before - it performs extensive traversal of the graph.  Once the cache 
had become hot the performance was good for each iteration, but periodically (I 
assume when garbage collection kicked in) the impact of that garbage collection 
was significantly greater with the ram disk configuration than without it.

The second test was to see how fast nodes could be looked up by an indexed 'id' 
value (not their node id, but a domain id integer value).  I have one particular 
domain entity of which there are about 3.5m instances, all indexed by id.  I 
created a test that read these ids from the original SQL database into memory, 
then iterated through each value and retrieved the corresponding node from neo4j 
via the index.  The whole neo4j graphdb directory structure had been copied to a 
ram disk area, so I had expected the test to be cpu-bound, but watching it with 
jconsole it had very low cpu utilisation and its performance wasn't much better 
(maybe 10%) than when the graphdb directory structure had been on a local disk. 
On a local disk the test retrieved about 12k nodes per second; with the ram 
disk it retrieved approximately 13k nodes per second.
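
For context, a minimal sketch of the lookup loop (a reconstruction under assumptions, not the actual test code; the index name 'domainId', the key 'id' and the loadIdsFromSql helper are illustrative):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.index.Index;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class IndexLookupTest {
    public static void main(String[] args) {
        GraphDatabaseService db = new EmbeddedGraphDatabase("/ramdisk/graph.db");
        Index<Node> byDomainId = db.index().forNodes("domainId");
        long found = 0, start = System.currentTimeMillis();
        for (int id : loadIdsFromSql()) {                  // ids read from the SQL source
            Node n = byDomainId.get("id", id).getSingle(); // one index hit per id
            if (n != null) found++;
        }
        System.out.println(found + " nodes in " + (System.currentTimeMillis() - start) + " ms");
        db.shutdown();
    }

    private static int[] loadIdsFromSql() { return new int[0]; } // stub
}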

BTW, I'm starting to think that my own APIs around neo4j might be adding 
significant overhead - not because they do much, but because I wrap fly-weight 
wrapper objects around each node as I retrieve it, then often discard them. 
I notice on jconsole that the heap graph shows a very regular pattern of 
rapid heap growth followed by reduction - I suspect that reflects an 
accumulation of these very short-lived wrapper objects that then get garbage 
collected.  This reflects the approach and is perhaps exacerbated for queries 
that need to traverse the object graph.  Rather than use the neo4j traversal 
framework, I'm tending to build simple accessor methods on each type for 
traversing to its neighbour types, then building queries on top of these that 
traverse the domain model in terms of the lower-level accessors.  This is 
attractive in terms of reducing coupling to neo4j, but I think it might be the 
cause of lots of very short-lived fly-weight objects, and collections of them, 
being created and discarded.  Any comment - is this an anti-pattern and a 
rationale to use the neo4j traversal framework mechanism?
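
To illustrate, a stripped-down sketch of the wrapper pattern in question (class and relationship names are hypothetical):

import java.util.ArrayList;
import java.util.List;
import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;

// Each traversal step allocates fresh short-lived wrappers, which would
// produce exactly the saw-tooth heap pattern seen in jconsole.
class Trade {
    private final Node node;
    Trade(Node node) { this.node = node; }

    // Accessor method traversing to a neighbour type.
    List<Trade> counterparts() {
        List<Trade> result = new ArrayList<Trade>();
        for (Relationship rel : node.getRelationships(
                DynamicRelationshipType.withName("COUNTERPART"), Direction.OUTGOING)) {
            result.add(new Trade(rel.getEndNode())); // new wrapper per hop, soon garbage
        }
        return result;
    }
}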

On 3 Jul 2011, at 12:47, Michael Hunger wrote:

 Paul,
 
 could you provide some details on which kinds of perf-tests don't perform 
 better on a RAM disk?
 
 Thanks Michael 
 
 Sent from my iBrick4
 
 


Re: [Neo4j] Does anyone use Solaris for neo4j app? [was Unable to memory map [was java.io.IOException: Resource temporarily unavailable in MappedPersistenceWindow.java?

2011-07-01 Thread Paul Bandler

On 30 Jun 2011, at 23:42, Michael Hunger wrote:

 I would love to look into the memory mapping issues on Solaris.

There is no issue - just user misunderstanding of how it works - or doesn't, if 
one's database is located on an NFS file-system.  A little documentation on the 
necessary preconditions would perhaps be in order.

Still have some other performance issues I don't fully understand - like why, 
when locating the database on a ram disk, certain performance tests don't seem 
to run any quicker and exhibit low cpu utilisation as if they're still I/O 
bound - but that will have to wait for next week.


[Neo4j] java.io.IOException: Resource temporarily unavailable in MappedPersistenceWindow.java?

2011-06-30 Thread Paul Bandler
When running a test neo4j application on Solaris that I have previously run 
successfully on Windows, I'm encountering the following exception:

Caused by: java.io.IOException: Resource temporarily unavailable
at sun.nio.ch.FileChannelImpl.map0(Native Method)
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:747)
at 
org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow.init(MappedPersistenceWindow.java:53)

This in turn is causing a huge knock-on effect, but I've not included that 
stack trace for clarity.  The program appears to attempt to continue, 
albeit at a snail's pace.

I'm running the latest 1.4 milestone release, with a maximum heap of 2G and 
all other store parameters left at their defaults.

Any suggestions as to what is happening would be most welcome.


[Neo4j] Unable to memory map [was java.io.IOException: Resource temporarily unavailable in MappedPersistenceWindow.java?

2011-06-30 Thread Paul Bandler
Further to an earlier posting, I also notice this warning from neo4j, which 
seems relevant. Note that the application does continue and eventually 
completes successfully, having taken orders of magnitude longer than when run 
on Windows. This is running on Solaris - any suggestions about what could be 
causing this behaviour are most welcome. Are there issues with using neo4j on 
Solaris?

Jun 30, 2011 10:58:59 AM 
org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool logWarn
WARNING: [./neo4j-advanced-1.4.M05/data/graph.db/neostore.relationshipstore.db] 
Unable to memory map
org.neo4j.kernel.impl.nioneo.store.MappedMemException: Unable to map pos=28593 
recordSize=33 totalSize=104841
at 
org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow.init(MappedPersistenceWindow.java:61)
at 
org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.allocateNewWindow(PersistenceWindowPool.java:603)
at 
org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.refreshBricks(PersistenceWindowPool.java:501)
at 
org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:128)
at 
org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:526)
at 
org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getChainRecord(RelationshipStore.java:327)
at 
org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.getMoreRelationships(ReadTransaction.java:114)
at 
org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.getMoreRelationships(ReadTransaction.java:97)
at 
org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:108)
at 
org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:604)
at 
org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:403)
at 
org.neo4j.kernel.impl.core.IntArrayIterator.hasNext(IntArrayIterator.java:98)
at 
com.nomura.smo.vcs.rdm.neo4j.AbstractBaseNeo4j.getSectorRelations(AbstractBaseNeo4j.java:42)

Sent from my iPhone



Re: [Neo4j] Unable to memory map [was java.io.IOException: Resource temporarily unavailable in MappedPersistenceWindow.java?

2011-06-30 Thread Paul Bandler
A colleague has speculated that it may be related to permissions. I'm on a 
shared Solaris box with no setuid access - can anyone elaborate on whether some 
specific access rights are required to use memory-mapped I/O?

Sent from my iPhone



Re: [Neo4j] Unable to memory map [was java.io.IOException: Resource temporarily unavailable in MappedPersistenceWindow.java?

2011-06-30 Thread Paul Bandler
TEST:bandlerp@us2187$ uname -X
System = SunOS
Node = us2187
Release = 5.10
KernelID = Generic_13-03
Machine = sun4u
BusType = unknown
Serial = unknown
Users = unknown
OEM# = 0
Origin# = 1
NumCPU = 16

Sent from my iPhone

On 30 Jun 2011, at 12:53, Michael Hunger michael.hun...@neotechnology.com 
wrote:

 Paul,
 
 what version of Solaris are you running that on? We don't have Solaris as 
 part of our build-qa workflow (yet), so I would try to see if there is an ec2 
 instance that I could just use for that.
 
 Cheers
 
 Michael
 


[Neo4j] Does anyone use Solaris for neo4j app? [was Unable to memory map [was java.io.IOException: Resource temporarily unavailable in MappedPersistenceWindow.java?

2011-06-30 Thread Paul Bandler
Further to the problems reported below from running a test application on a 
server-class Solaris box (M4000), I re-ran the test with the setting 
'use_memory_mapped_buffers' set to false.  This had the positive effect of 
avoiding the warnings and nio exceptions mentioned before, so clearly there's 
some additional magic required to run under Solaris using memory-mapped 
buffers.  (I have enquired as to whether one needs setuid rights to use memory-
mapped files on Solaris and am told not, btw...)
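
For anyone reproducing the non-mmap run, a minimal sketch of passing that setting to an embedded database in 1.4 (store path illustrative; dump_configuration just echoes the effective settings at startup, as in the config dumps further down this digest):

import java.util.Map;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public class NoMmapExample {
    public static void main(String[] args) {
        Map<String, String> config = MapUtil.stringMap(
                "use_memory_mapped_buffers", "false", // plain (non-mmap) buffers
                "dump_configuration", "true");        // print effective settings
        GraphDatabaseService db = new EmbeddedGraphDatabase("data/graph.db", config);
        // ... run the test ...
        db.shutdown();
    }
}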

The performance was woeful, however - like walking through molasses in (a Swedish) 
mid-winter!  The test comprises a relatively complex navigation of a domain 
model to derive some information.  This is a re-implementation of a function 
that already exists in a production C++ application that uses an object 
database (based on a product called ObjectStore) - so I have something to 
compare the results against.  I also have the results of running the same neo4j 
test on a low-end Windows PC - here are the comparative timings.  Each test 
contains 2 executions of exactly the same function, the idea being that the 
first execution is the worst-case scenario, with the caches stone-cold, while 
the second execution, being exactly the same traversals, should be the best-
case scenario, with the caches red-hot...  In all cases I'm using an embedded 
neo4j database.

Windows (3.2GHz PC with 1GB of heap, neo4j database on local disk):
- execution 1 ~ 41 seconds
- execution 2 ~ 1.6 seconds

Solaris (Sun SPARC Enterprise M4000 Server; system clock frequency: 1012 MHz; 
memory size: 32768 megabytes; 3.5GB heap; neo4j database on NFS-mounted 
filesystem):
- execution 1 ~ 27+ _minutes_
- execution 2 ~ 5.5 seconds

Re-run with neo4j database on a local file system:
- execution 1 ~ 20 seconds
- execution 2 ~ 1.56 seconds

Existing C++/ObjectStore implementation on same Solaris box as above:
- execution 1 ~ 3.2 seconds
- execution 2 ~ 0.1 seconds

These results are disappointing.  It appears that the first execution is 
extremely I/O bound, with a local enterprise server disk only improving on a 
low-end Windows PC by a factor of 2, and NFS reducing performance by a factor 
of about x80.  Once warmed, the low-end Windows PC is almost on par with the 
enterprise Solaris performance.  The fastest neo4j execution 2 is 16x slower 
than the existing C++/ObjectStore implementation.

Any suggestions on how it might be possible to narrow the gap between the 
performance of the C++/ObjectStore implementation and the neo4j one would be 
most welcome, as otherwise this will be the end of the line for this approach. 
I can accept the poor comparison of the first run if, once the caches are hot, 
the second run comes somewhere reasonably close to what one might expect from a 
comparison between a C++ and a Java implementation of a CPU-intensive 
application operation (say 2-3 times slower in Java...)?

cheers,

Paul

Re: [Neo4j] Does anyone use Solaris for neo4j app? [was Unable to memory map [was java.io.IOException: Resource temporarily unavailable in MappedPersistenceWindow.java?

2011-06-30 Thread Paul Bandler
Oops - it seems it might not be neo4j that's chewing up the CPU. By guarding my 
log4j calls with 'if (_log.isDebugEnabled())' pre-conditions, to avoid the 
expense of constructing the strings that might or might not get logged, I was 
able to improve the times for the 'hot-cache' tests described below by about 
x10, bringing them within a factor of 1.5-2 of the C++ performance, which is 
within acceptable limits.
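
The guard in question is the standard log4j idiom; a minimal sketch (class and message are illustrative):

import org.apache.log4j.Logger;

public class TraversalStep {
    private static final Logger _log = Logger.getLogger(TraversalStep.class);

    void visit(long nodeId, int relCount) {
        // Without the guard the message string is concatenated on every call,
        // even when DEBUG output is disabled - that is the cost being avoided.
        if (_log.isDebugEnabled()) {
            _log.debug("visited node " + nodeId + " with " + relCount + " relationships");
        }
    }
}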

I'm still interested to know whether others are using neo4j on Solaris, whether 
it's possible to use memory-mapped I/O for buffers on Solaris, and indeed 
whether that would further help performance, or does it just reduce the heap 
demand?


[Neo4j] Moving a Neo4j Database?

2011-06-29 Thread Paul Bandler
Is there some record within the Neo4j database or its indexing that retains 
knowledge of the location where it was created?  I tried copying a neo4j 
database from a Windows drive to a Unix file system and started a Unix neo4j 
application pointing to it; while it connected apparently OK, when it tried 
to find its first index it failed to find it, even though I can clearly see the 
relevant index files in its filesystem area.  I also notice in the neo4j 
startup that it reports its 'store_dir' property as referencing the Windows 
drive and directory from whence it originated.


[Neo4j] Advantages of EmbeddedReadOnlyGraphDatabase ? [was NonWritableChannelException]

2011-06-28 Thread Paul Bandler
Are there advantages in accessing the database via the 
EmbeddedReadOnlyGraphDatabase class in a read-only application?

A week or so back I posted the message below regarding a problem experienced 
while using EmbeddedReadOnlyGraphDatabase that regrettably didn't solicit any 
responses; since then I've been using the standard read-write 
EmbeddedGraphDatabase without a repeat of the same issue, even though my 
application is read-only. Are there any avoidable performance penalties in 
using EmbeddedGraphDatabase in place of EmbeddedReadOnlyGraphDatabase?

Along similar lines: as my application is read-only, I'm not doing any 
explicit transaction management - is there any reason why I should?
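
For reference, the read-only variant is a drop-in replacement at construction time; a minimal sketch (store path illustrative):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.kernel.EmbeddedReadOnlyGraphDatabase;

public class ReadOnlyExample {
    public static void main(String[] args) {
        // Opens the store read-only; write operations are rejected.
        GraphDatabaseService db = new EmbeddedReadOnlyGraphDatabase("data/graph.db");
        // ... read-only traversals here ...
        db.shutdown();
    }
}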



Begin forwarded message:

 From: Paul Bandler pband...@cseuk.co.uk
 Date: 21 June 2011 12:22:56 GMT+01:00
 To: Neo4j user discussions user@lists.neo4j.org
 Subject: [Neo4j] NonWritableChannelException
 Reply-To: Neo4j user discussions user@lists.neo4j.org
 
 The above exception is thrown from the call stack indicated below while 
 traversing a neo4j graph using the EmbeddedReadOnly database. Using 1.4M04.
 
 The application is running with 1GB of heap, defaulting all other 
 parameters except cache_type=weak, on Windows.
 
 I found some reports of this exception being thrown at shutdown back 
 in January, but this is not happening at shutdown, and I could find no posted 
 resolution of that thread anyway.
 
 Can anyone suggest what the cause of this exception is?
 
 Thanks
 
 Paul
 
 
 
 Exception in thread main java.nio.channels.NonWritableChannelException
at sun.nio.ch.FileChannelImpl.write(Unknown Source)
at 
 org.neo4j.kernel.impl.nioneo.store.AbstractPersistenceWindow.writeOut(AbstractPersistenceWindow.java:104)
at 
 org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.refreshBricks(PersistenceWindowPool.java:536)
at 
 org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:128)
at 
 org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:526)
at 
 org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getChainRecord(RelationshipStore.java:327)
at 
 org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.getMoreRelationships(ReadTransaction.java:114)
at 
 org.neo4j.kernel.impl.nioneo.xa.ReadTransaction.getMoreRelationships(ReadTransaction.java:97)
at 
 org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:108)
at 
 org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:603)
at 
 org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:399)
at 
 org.neo4j.kernel.impl.core.IntArrayIterator.hasNext(IntArrayIterator.java:93)
at 
 org.neo4j.kernel.impl.core.NodeImpl.getSingleRelationship(NodeImpl.java:218)
 


Re: [Neo4j] Advantages of EmbeddedReadOnlyGraphDatabase ? [was NonWritableChannelException]

2011-06-28 Thread Paul Bandler
The problem resulting in the NonWritableChannelException occurred on Windows 
although I need to run on Solaris as well as Windows.

Actually, I must apologise for saying that the previously reported issue didn't 
solicit a response - I see that Johan Svensson did respond just today, 
suggesting it might be fixed in the new milestone release.

My question regarding the actual effect of using 
EmbeddedReadOnlyGraphDatabase is still relevant, however...?

On 28 Jun 2011, at 21:21, Rick Bullotta wrote:

 Paul, are you on Windows or Linux?
 


[Neo4j] Webadmin browser dependencies?

2011-06-23 Thread Paul Bandler
I am unable to access the data tab using IE version 7. I tried the Eclipse 
built-in browser and it momentarily switches to the data tab, but then it 
disappears again.

What are the supported browser platforms?

Sent from my iPhone


Re: [Neo4j] Webadmin browser dependencies?

2011-06-23 Thread Paul Bandler
Thanks, but none of those browsers are allowed to be downloaded within my 
client's corporate network...

Again, does anyone know what IE level works?

Sent from my iPhone

On 23 Jun 2011, at 13:53, Tatham Oddie tat...@oddie.com.au wrote:

 Data browser works in Chrome / Firefox / Safari.
 
 It should give you a message to this effect on unsupported browsers.
 
 
 -- Tatham
 
 


[Neo4j] BatchInserter improvement with 1.4M04 but still got relationship building bottleneck [was Re: Speeding up initial import of graph...]

2011-06-13 Thread Paul Bandler
. Took 2687
190 nodes created. Took 2969
200 nodes created. Took 2891
Creating nodes took 61
MY_SIZE: 12
CompactNodeIndex slot count: 200
10 relationships created. Took 311377
20 relationships created. Took 11297
30 relationships created. Took 11062
40 relationships created. Took 10891
50 relationships created. Took 11109
60 relationships created. Took 11375
70 relationships created. Took 11266
80 relationships created. Took 26469
90 relationships created. Took 46875
100 relationships created. Took 12047
110 relationships created. Took 43016
120 relationships created. Took 12110
130 relationships created. Took 12625
140 relationships created. Took 12031
150 relationships created. Took 40375
160 relationships created. Took 11328
170 relationships created. Took 11125
180 relationships created. Took 10891
190 relationships created. Took 11266
200 relationships created. Took 11125
210 relationships created. Took 11281
220 relationships created. Took 11156
230 relationships created. Took 11250
240 relationships created. Took 11735
250 relationships created. Took 15984
260 relationships created. Took 16766
270 relationships created. Took 71969
280 relationships created. Took 205283
290 relationships created. Took 159236
300 relationships created. Took 32734
310 relationships created. Took 149064
320 relationships created. Took 116391
330 relationships created. Took 74079
340 relationships created. Took 43360
350 relationships created. Took 20500
360 relationships created. Took 246704
370 relationships created. Took 74407
380 relationships created. Took 189611
390 relationships created. Took 44922
400 relationships created. Took 482675
Creating relationships took 2628

iMac (REPORT_COUNT = MILLION)
Physical mem: 4096MB, Heap size: 2039MB
use_memory_mapped_buffers=false
neostore.propertystore.db.index.keys.mapped_memory=1M
neostore.propertystore.db.strings.mapped_memory=106M
neostore.propertystore.db.arrays.mapped_memory=120M
neo_store=/Users/paulbandler/Documents/workspace/Neo4jImport/target/hepper/neostore
neostore.relationshipstore.db.mapped_memory=152M
neostore.propertystore.db.index.mapped_memory=1M
neostore.propertystore.db.mapped_memory=124M
dump_configuration=true
cache_type=weak
neostore.nodestore.db.mapped_memory=34M
100 nodes created. Took 2817 
200 nodes created. Took 2407 
300 nodes created. Took 2086 
400 nodes created. Took 2303 
500 nodes created. Took 2912 
600 nodes created. Took 2178 
700 nodes created. Took 2241 
800 nodes created. Took 2453 
900 nodes created. Took 2627 
1000 nodes created. Took 3996 
Creating nodes took 26
MY_SIZE: 12
CompactNodeIndex slot count: 1000
100 relationships created. Took 198784 
200 relationships created. Took 24203 
300 relationships created. Took 25313 
400 relationships created. Took 22177 
500 relationships created. Took 22406 
600 relationships created. Took 84977 
700 relationships created. Took 402123 
800 relationships created. Took 1342290 

 
On 10 Jun 2011, at 08:27, Michael Hunger wrote:

 You're right, the Lucene-based import shouldn't fail with memory problems; I 
 will look into that.
 
 My suggestion is valid if you want to use an in-memory map to speed up the 
 import. And if you're able to analyze / partition your data, that 
 might be a viable solution.
 
 Will get back to you with the findings later.
 
 Michael
 
 Am 10.06.2011 um 09:02 schrieb Paul Bandler:
 
 
 On 9 Jun 2011, at 22:12, Michael Hunger wrote:
 
 Please keep in mind that a HashMap of 10M strings -> longs will take a 
 substantial amount of heap memory.
 That's not the fault of Neo4j :) On my system it alone takes 1.8G of 
 memory (distributed across the strings, the hashmap entries and the longs).
 
 
 Fair enough, but removing the Map, using the Index instead, and setting 
 the cache_type to weak makes almost no difference to the program's behaviour 
 in terms of progressively consuming the heap until it fails.  I did this, 
 including removing the allocation of the Map, and watched the heap 
 consumption follow a similar pattern until it failed, as below.
 
 Or you should perhaps use an amazon ec2 instance which you can easily get 
 with up to 68 G of RAM :)
 
 With respect, and while I notice the smile, throwing memory at it is not an 
 option for the large set of enterprise applications that might actually be 
 willing to pay to use Neo4j if it didn't fail at the first hurdle when 
 confronted with a trivial, small-scale data load...
 
 runImport failed after 2,072 seconds
 
 Creating data took 316 seconds
 Physical mem: 1535MB, Heap size: 1016MB
 use_memory_mapped_buffers=false
 neostore.propertystore.db.index.keys.mapped_memory=1M
 neostore.propertystore.db.strings.mapped_memory=52M

Re: [Neo4j] BatchInserter improvement with 1.4M04 but still got relationship building bottleneck [was Re: Speeding up initial import of graph...]

2011-06-13 Thread Paul Bandler
 and the CompactIndex you wrote?
 
 That would be great. 
 
 Also the memory settings (Xmx) you used for the different runs.
 
 Thanks so much
 
 Michael
 
 Am 13.06.2011 um 14:15 schrieb Paul Bandler:
 
 Having noticed a mention in the 1.4M04 release notes that:
 
 Also, the BatchInserterIndex now keeps its memory usage in-check with 
 batched commits of indexed data using a configurable batch commit size.
 
 I re-ran this test using M04 and sure enough, node creation no longer eats 
 up the heap linearly, so that is good - I should be able to remove the 
 periodic resetting of the BatchInserter during import.
 
 So I returned to the issue of removing the index creation and later access 
 bottleneck using an application-managed data structure, as Michael 
 illustrated.  Needing a solution with a smaller memory footprint, I wrote 
 a CompactNodeIndex class for mapping integer 'id' key values to long node ids 
 that minimises the memory footprint by overlaying a binary-choppable table 
 onto a byte array.  Watching the heap on jconsole while this ran, I could see 
 it had the desired effect of releasing huge amounts of heap once the 
 CompactNodeIndex is loaded and the source data structure gc'd.  However, when 
 I attempted to scale the test program back up to the 10M nodes Michael had 
 been testing, it ran into something of a brick wall, becoming 
 massively I/O bound when creating the relationships.  With 1M nodes it ran 
 OK, with 2M nodes not too bad, but much beyond that it crawls along using 
 just about 1% of CPU while having loads of heap spare.
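
 To illustrate the idea (a sketch of the approach described, not the actual CompactNodeIndex source): fixed 12-byte entries - a 4-byte int key plus an 8-byte long node id, matching the MY_SIZE: 12 printed above - packed into one buffer and binary-chopped:

import java.nio.ByteBuffer;

class CompactNodeIndexSketch {
    private static final int ENTRY_SIZE = 12; // 4-byte key + 8-byte node id
    private final ByteBuffer table;
    private final int count;

    // keys must be supplied in ascending order, parallel to nodeIds
    CompactNodeIndexSketch(int[] keys, long[] nodeIds) {
        count = keys.length;
        table = ByteBuffer.allocate(count * ENTRY_SIZE);
        for (int i = 0; i < count; i++) {
            table.putInt(keys[i]);
            table.putLong(nodeIds[i]);
        }
    }

    // Returns the node id for key, or -1 if absent (binary chop).
    long get(int key) {
        int lo = 0, hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int k = table.getInt(mid * ENTRY_SIZE);
            if (k < key) lo = mid + 1;
            else if (k > key) hi = mid - 1;
            else return table.getLong(mid * ENTRY_SIZE + 4);
        }
        return -1;
    }
}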
 
 I re-ran on a more generously configured iMac (giving the test 4G of heap) 
 and it did much better, in that it actually showed some progress building 
 relationships over a 10M node-set, but it still exhibited a massive slow-down 
 once past 7M relationships.
 
 Below are the test results.  The question now is: are there any Neo4j 
 parameters that might ease the I/O bottleneck that appears when building 
 relationships over node sets of this size with the BatchInserter...?  I note 
 the section in the manual on performance parameters but, not being familiar 
 enough with the Neo4j internals, I don't feel it gives enough clear 
 information on how to set them to improve the performance of this use-case.
 
 Thanks,
 
 Paul
 
 Run 1 - Windows m/c..REPORT_COUNT = MILLION/10
 Physical mem: 1535MB, Heap size: 1016MB
 use_memory_mapped_buffers=false
 neostore.propertystore.db.index.keys.mapped_memory=1M
 neostore.propertystore.db.strings.mapped_memory=52M
 neostore.propertystore.db.arrays.mapped_memory=60M
 neo_store=N:\TradeModel\target\hepper\neostore
 neostore.relationshipstore.db.mapped_memory=76M
 neostore.propertystore.db.index.mapped_memory=1M
 neostore.propertystore.db.mapped_memory=62M
 dump_configuration=true
 cache_type=weak
 neostore.nodestore.db.mapped_memory=17M
 10 nodes created. Took 2906
 20 nodes created. Took 2688
 30 nodes created. Took 2828
 40 nodes created. Took 2953
 50 nodes created. Took 2672
 60 nodes created. Took 2766
 70 nodes created. Took 2687
 80 nodes created. Took 2703
 90 nodes created. Took 2719
 100 nodes created. Took 2641
 Creating nodes took 27
 MY_SIZE: 12
 CompactNodeIndex slot count: 100
 10 relationships created. Took 4125
 20 relationships created. Took 3953
 30 relationships created. Took 3937
 40 relationships created. Took 3610
 50 relationships created. Took 3719
 60 relationships created. Took 4328
 70 relationships created. Took 3750
 80 relationships created. Took 3609
 90 relationships created. Took 4125
 100 relationships created. Took 3781
 110 relationships created. Took 4125
 120 relationships created. Took 3750
 130 relationships created. Took 3907
 140 relationships created. Took 4297
 150 relationships created. Took 3703
 160 relationships created. Took 3687
 170 relationships created. Took 4328
 180 relationships created. Took 3907
 190 relationships created. Took 3718
 200 relationships created. Took 3891
 Creating relationships took 78
 
 2M Nodes on Windows m/c:-
 
 Creating data took 68 seconds
 Physical mem: 1535MB, Heap size: 1016MB
 use_memory_mapped_buffers=false
 neostore.propertystore.db.index.keys.mapped_memory=1M
 neostore.propertystore.db.strings.mapped_memory=52M
 neostore.propertystore.db.arrays.mapped_memory=60M
 neo_store=N:\TradeModel\target\hepper\neostore
 neostore.relationshipstore.db.mapped_memory=76M
 neostore.propertystore.db.index.mapped_memory=1M
 neostore.propertystore.db.mapped_memory=62M
 dump_configuration=true
 cache_type=weak
 neostore.nodestore.db.mapped_memory=17M
 10 nodes created. Took 3188
 20 nodes created. Took 3094
 30 nodes created. Took 3062
 40 nodes created. Took 2813
 50 nodes created. Took 2718
 60 nodes created. Took 3000
 70 nodes created. Took 2938
 80 nodes created. Took 2828
 90 nodes

Re: [Neo4j] Speeding up initial import of graph

2011-06-10 Thread Paul Bandler
 On 9 Jun 2011, at 12:27, Michael Hunger wrote:
 
 I recreated Daniel's code in Java, mainly because some things were missing 
 from his Scala example.
 
 You're right that the index is the bottleneck. But with your small data set 
 it should be possible to cache the 10m nodes in a heap that fits in your 
 machine.
 
 I ran it first with the index and had about 8 seconds / 1M nodes and 320 
 sec/1M rels.
 
 Then I switched to a 3G heap and a HashMap to keep the name -> node lookup, 
 and it went to 2s/1M nodes and 13 down to 3 sec for 1M rels.
 
 That is the approach that Chris takes, only his solution can persist 
 the map to disk and is more efficient :)
 
 Hope that helps.
 
 Michael
 
 package org.neo4j.load;
 
 import org.apache.commons.io.FileUtils;
 import org.junit.Test;
 import org.neo4j.graphdb.RelationshipType;
 import org.neo4j.graphdb.index.BatchInserterIndex;
 import org.neo4j.graphdb.index.BatchInserterIndexProvider;
 import org.neo4j.helpers.collection.MapUtil;
 import org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider;
 import org.neo4j.kernel.impl.batchinsert.BatchInserter;
 import

Re: [Neo4j] Speeding up initial import of graph

2011-06-09 Thread Paul Bandler
I too am experiencing similar problems - possibly worse than you're seeing, as I 
am using a very modestly provisioned Windows m/c (1.5GB RAM, max heap set 
to 1GB, oldish processor).

I found that when using the BatchInserter for loading nodes, the heap grew and 
grew until, when it was exhausted, everything practically ground to a halt.  I 
experimented with various settings of the cache memory, but nothing made much 
difference. So now I reset the BatchInserter (i.e. shut it down and re-start 
it) every 100,000 nodes or so - see the sketch below.  I posted questions on 
the list before, but the replies seemed to suggest that it was just a config 
issue - yet no config changes I made helped much.   I get the impression that 
most people are using Neo4j with hugely larger memory footprints than I can 
realistically expect to use at this stage, and maybe that is why this problem 
may not receive much attention.
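
A minimal sketch of that reset workaround (store directory, property contents and counts are illustrative):

import java.util.Map;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class ResettingImport {
    private static final int RESET_EVERY = 100000;

    public static void main(String[] args) {
        String storeDir = "target/graph.db";
        BatchInserter inserter = new BatchInserterImpl(storeDir);
        for (int i = 0; i < 1000000; i++) {
            Map<String, Object> props = MapUtil.map("name", "node" + i);
            inserter.createNode(props);
            if (i > 0 && i % RESET_EVERY == 0) {
                inserter.shutdown();                        // flush and release heap
                inserter = new BatchInserterImpl(storeDir); // reopen and continue
            }
        }
        inserter.shutdown();
    }
}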

I have a similar approach to you for relationships - i.e. creating them in a 
second pass.  I'm not sure how memory-hungry it is, but again I have built a 
class that resets the inserters every 100,000 relationships.  It is slow, and 
experimenting with my 'reset' size didn't make much difference, so I suspect 
it's limited by index access time.  Effectively, I suspect it's going to disk 
for every index lookup that it sees for the first time, and I also suspect 
that the size of the index might make a difference, as I have over 3m 
nodes in some indexes and these are the ones that are very slow.

I suspect there might be some tuning that can be done, and I really think the 
problem with running out of heap is probably a bug that should be fixed, but I 
am now turning my attention to finding ways of creating relationships when the 
initial nodes are created (at least where this is possible) to 
avoid the index lookup overhead...

I'll let you know if/how this helps, but am also interested to learn of others' 
experience.

On 9 Jun 2011, at 10:59, Daniel Hepper wrote:

 Hi all,
 
 I'm struggling with importing a graph with about 10m nodes and 20m
 relationships, with nodes having 0 to 10 relationships. Creating the
 nodes takes about 10 minutes, but creating the relationships is slower
 by several orders of magnitude. I'm using a 2.4 GHz i7 MacBookPro with
 4GB RAM and conventional HDD.
 
 The graph is stored as adjacency list in a text file where each line
 has this form:
 
 Foo|Bar|Baz
 (Node Foo has relations to Bar and Baz)
 
 My current approach is to iterate over the whole file twice. In the
 first run, I create a node with the property "name" set to the first
 entry in the line ("Foo" in this case) and add it to an index.
 In the second run, I get the start node and the end nodes from the
 index by name and create the relationships.
 
 My code can be found here: http://pastie.org/2041801
 
 With my approach, the best I can achieve is 100 created relationships
 per second.
 I experimented with mapped memory settings, but without much effect.
 Is this the speed I can expect?
 Any advice on how to speed up this process?
 
 Best regards,
 Daniel Hepper
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user
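
Since the pastie above may not stay available, here is a minimal sketch of the 
two-pass approach described there, assuming the Foo|Bar|Baz line format and the 
1.3-era batch-inserter index API; the file name, store path and RELATED_TO 
relationship type are illustrative:

import java.io.BufferedReader;
import java.io.FileReader;

import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.index.BatchInserterIndex;
import org.neo4j.graphdb.index.BatchInserterIndexProvider;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class TwoPassImport {
    public static void main(String[] args) throws Exception {
        BatchInserter inserter = new BatchInserterImpl("target/graph.db");
        BatchInserterIndexProvider indexes =
                new LuceneBatchInserterIndexProvider(inserter);
        BatchInserterIndex index =
                indexes.nodeIndex("names", MapUtil.stringMap("type", "exact"));

        // Pass 1: one node per line, keyed on the first field.
        BufferedReader in = new BufferedReader(new FileReader("graph.txt"));
        for (String line; (line = in.readLine()) != null;) {
            String name = line.split("\\|")[0];
            long node = inserter.createNode(MapUtil.map("name", name));
            index.add(node, MapUtil.map("name", name));
        }
        in.close();
        index.flush(); // make pass-1 entries visible to pass-2 lookups

        // Pass 2: resolve every field through the index and wire them up.
        in = new BufferedReader(new FileReader("graph.txt"));
        for (String line; (line = in.readLine()) != null;) {
            String[] fields = line.split("\\|");
            long start = index.get("name", fields[0]).getSingle();
            for (int i = 1; i < fields.length; i++) {
                long end = index.get("name", fields[i]).getSingle();
                inserter.createRelationship(start, end,
                        DynamicRelationshipType.withName("RELATED_TO"), null);
            }
        }
        in.close();
        indexes.shutdown();
        inserter.shutdown();
    }
}

Caveat: getSingle() will NPE on a name that never appears in the first column, 
so real input may need a guard there; and as this thread discusses, the index 
lookups in pass 2 are what dominate the running time.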

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Speeding up initial import of graph

2011-06-09 Thread Paul Bandler
I ran Michael’s example test import program, with the Map replacing the index, 
on my more modestly configured machine to see whether the import scaling 
problems I have reported previously using BatchInserter were reproduced.  They 
were – I gave the program 1G of heap and watched it run using jconsole.  It ran 
reasonably quickly, consuming the heap in an almost straight line until it 
neared capacity, then practically stopped for about 20 minutes, after which it 
died with an out-of-memory error – see below.
 
Now I’m not saying that Neo4j should necessarily go out of its way to support 
very memory-constrained environments, but I do think it is not unreasonable to 
expect its batch import mechanism not to fall over in this way; it should 
rather flush its buffers (or whatever is needed) without requiring the import 
application writer to shut it down and restart it periodically...
 
Creating data took 331 seconds
100 nodes created. Took 29001
200 nodes created. Took 35107
300 nodes created. Took 35904
400 nodes created. Took 66169
500 nodes created. Took 63280
600 nodes created. Took 183922
700 nodes created. Took 258276
 
com.nomura.smo.rdm.neo4j.restore.Hepper
createData(330.364seconds)
runImport (1,485 seconds later...)
java.lang.OutOfMemoryError: Java heap space
at java.util.ArrayList.<init>(Unknown Source)
at java.util.ArrayList.<init>(Unknown Source)
at org.neo4j.kernel.impl.nioneo.store.PropertyRecord.<init>(PropertyRecord.java:33)
at org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.createPropertyChain(BatchInserterImpl.java:425)
at org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.createNode(BatchInserterImpl.java:143)
at com.nomura.smo.rdm.neo4j.restore.Hepper.runImport(Hepper.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
 
 
Regards,
Paul Bandler 
On 9 Jun 2011, at 12:27, Michael Hunger wrote:

 I recreated Daniel's code in Java, mainly because some things were missing 
 from his Scala example.
 
 You're right that the index is the bottleneck. But with your small data set 
 it should be possible to cache the 10m nodes in a heap that fits in your 
 machine.
 
 I ran it first with the index and had about 8 seconds / 1M nodes and 320 
 sec/1M rels.
 
 Then I switched to a 3G heap and a HashMap to keep the name=node lookup, and 
 it went to 2s/1M nodes and 13 down to 3 sec for 1M rels.
 
 That is the approach that Chris takes, except that his solution can persist 
 the map to disk and is more efficient :)
 
 Hope that helps.
 
 Michael
 
 package org.neo4j.load;
 
 import org.apache.commons.io.FileUtils;
 import org.junit.Test;
 import org.neo4j.graphdb.RelationshipType;
 import org.neo4j.graphdb.index.BatchInserterIndex;
 import org.neo4j.graphdb.index.BatchInserterIndexProvider;
 import org.neo4j.helpers.collection.MapUtil;
 import org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider;
 import org.neo4j.kernel.impl.batchinsert.BatchInserter;
 import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl
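
Michael's listing is cut off above; as a stand-in, a minimal sketch of the 
HashMap-instead-of-index idea he describes - keeping the name-to-node-id 
mapping on the heap during the import. The class name and toy data are 
illustrative:

import java.util.HashMap;
import java.util.Map;

import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class HashMapImport {
    private static final String[] DATA = { "Foo|Bar|Baz", "Bar|Baz" }; // toy input

    public static void main(String[] args) {
        BatchInserter inserter = new BatchInserterImpl("target/graph.db");
        // In-heap lookup table replacing the Lucene index: name -> node id.
        Map<String, Long> nodes = new HashMap<String, Long>();

        // Pass 1: create a node per line, remembering ids on the heap.
        for (String line : DATA) {
            String name = line.split("\\|")[0];
            nodes.put(name, inserter.createNode(MapUtil.map("name", name)));
        }

        // Pass 2: wire relationships with cheap HashMap lookups.
        for (String line : DATA) {
            String[] fields = line.split("\\|");
            long start = nodes.get(fields[0]);
            for (int i = 1; i < fields.length; i++) {
                Long end = nodes.get(fields[i]);
                if (end == null) { // name never seen in the first column
                    end = inserter.createNode(MapUtil.map("name", fields[i]));
                    nodes.put(fields[i], end);
                }
                inserter.createRelationship(start, end,
                        DynamicRelationshipType.withName("RELATED_TO"), null);
            }
        }
        inserter.shutdown();
    }
}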

Re: [Neo4j] BatchInserter exhausting heap...?

2011-06-02 Thread Paul Bandler
I monitored the heap using jconsole and, much to my surprise, observed that the 
heap stayed relatively stable while the overall memory occupancy of the process 
grew steadily until it reached ~500M. I'm now rather confused as to what else 
can be consuming memory like that. Any ideas, folks?

Sent from my iPhone
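
For what it's worth, process growth outside the Java heap during batch 
insertion is consistent with the memory-mapped store buffers, which are sized 
by the neostore.*.mapped_memory settings rather than by -Xmx. A sketch of 
capping them via the map passed to BatchInserterImpl; the sizes are 
illustrative and the key names should be checked against the 1.3 configuration 
docs:

import java.util.Map;

import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class CappedMmapInsert {
    public static void main(String[] args) {
        // Illustrative sizes for a 2GB machine; tune per store file.
        Map<String, String> config = MapUtil.stringMap(
                "neostore.nodestore.db.mapped_memory", "50M",
                "neostore.relationshipstore.db.mapped_memory", "100M",
                "neostore.propertystore.db.mapped_memory", "50M",
                "neostore.propertystore.db.strings.mapped_memory", "50M");
        BatchInserter inserter = new BatchInserterImpl("target/graph.db", config);
        // ... create nodes and relationships ...
        inserter.shutdown();
    }
}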

On 1 Jun 2011, at 20:52, Michael Hunger michael.hun...@neotechnology.com 
wrote:

 props are passed in to the batch inserter - look into messages.log and you'll 
 see the different gc behaviour
 
 Michael
 
 Sent from my iBrick4
 
 
  On 01.06.2011 at 20:44, Paul Bandler pband...@cseuk.co.uk wrote:
 
 Is that simply set as a system property, or via the Map passed as the second 
 parameter to the BatchInserterImpl constructor?  I've tried both and it 
 doesn't seem to help.  Is there some way I can verify that it's being used?
 
 I'm using 1.3 
 
 On 1 Jun 2011, at 18:49, Michael Hunger wrote:
 
 you could use cache_type=weak
 in the db properties
 
 you can easily introspect java programs (heap) using jmap, jconsole or 
 visualvm
 
 what version of neo4j are you using?
 
 index.flush just sets a flag for immediate index querying
 
 Sent from my iBrick4
 
 
  On 01.06.2011 at 19:18, Paul Bandler pband...@cseuk.co.uk wrote:
 
 I have a simple program that uses the BatchInserter to load rows from a 
 SQL database and am running it on a modestly configured Windows machine 
 with 2GB of RAM and setting the max heap to 500M.
 
 Initially it was running out of memory quite soon so I introduced a flush 
 after every 5000 nodes and it appeared that all was well.  But having got 
 further in the data load it appears to hop along nicely but the memory 
 allocated (simply visible using windows task manager) grows and grows 
 until I suspect it's reached its max heap size and it's written about 2M 
 nodes then abruptly stops making any further discernible progress.  It 
 doesn't fail, just the logging I've put in to log every 5000 nodes has 
 stopped and the CPU is 100% used - garbage collecting I suspect.
 
 Is there something I should be doing periodically in addition to the
 index flush to stop the heap exhaustion?  My code is really simple, here's 
 the method for loading nodes from each table:-
 
 public long restoreCollection() {
     resolveSql();
     _log.debug("restore collection: " + getCollectionName() + " using: "
             + _sql + " and: " + Arrays.deepToString(_columns));
     final BatchInserterIndex _index = makeIndex();
     final long collectionNode = _inserter.createNode(MapUtil.map("name",
             getCollectionName()));

     _log.debug("Query db...");
     getJdbcTemplate().query(_sql, new Object[] {},
             new RowCallbackHandler() {
                 public void processRow(ResultSet row) throws SQLException {
                     final Map<String, Object> properties = extractproperties(row);
                     long node = _inserter.createNode(properties);
                     _inserter.createRelationship(node, collectionNode,
                             RdmRelationship.MEMBER_OF, null);
                     if (_index != null)
                         for (DbColumn col : _columns) {
                             if (col.isIndexed())
                                 _index.add(node, MapUtil.map(col.getName(),
                                         properties.get(col.getName())));
                         }
                     _collectionSize++;
                     if (_collectionSize % FLUSH_INTERVAL == 0) {
                         if (_index != null)
                             _index.flush();
                         _log.debug("Added node: " + _collectionSize
                                 + " to: " + getCollectionName());
                     }
                 }
             });

     // long collectionNode = -1;
     if (_index != null) {
         _index.flush();
     }
     _log.debug("Completed restoring " + _collectionSize + " to: "
             + getCollectionName());
     return collectionNode;
 }
 
 
 and then around that a higher level function that handles all tables:-
 
 public void run() {
     throwIfNull(_restorers, "Restorers missing");
     throwIfNull(_inserter, "Batch inserter missing");
     int totalNodes = 0;
     int totalRelationships = 0;
     try {
         for (CollectionRestorer r : _restorers) {
             long collection = r.restoreCollection();
             totalNodes += r.getCollectionSize();
             _inserter.createRelationship(_inserter.getGraphDbService()
                     .getReferenceNode().getId(), collection,
                     RdmRelationship.CLASS_EXTENT, null);
         }
         for (ParentChildRelationshipBuilder r : _relators) {
             r.makeRelationships();
             totalRelationships += r.getRelations();
         }
     } finally {
         _inserter.shutdown();
         _log.info("Batch inserter shutdown.  Created: " + totalNodes
                 + " nodes and " + totalRelationships + " relationships");
     }
 }
 
 Any suggestions welcome

[Neo4j] BatchInserter exhausting heap...?

2011-06-01 Thread Paul Bandler
I have a simple program that uses the BatchInserter to load rows from a SQL 
database and am running it on a modestly configured Windows machine with 2GB of 
RAM and setting the max heap to 500M.

Initially it was running out of memory quite soon so I introduced a flush after 
every 5000 nodes and it appeared that all was well.  But having got further in 
the data load it appears to hop along nicely but the memory allocated (simply 
visible using windows task manager) grows and grows until I suspect it's 
reached its max heap size and it's written about 2M nodes then abruptly stops 
making any further discernible progress.  It doesn't fail, just the logging 
I've put in to log every 5000 nodes has stopped and the CPU is 100% used - 
garbage collecting I suspect.

Is there something I should be doing periodically in addition to the index 
flush to stop the heap exhaustion?  My code is really simple, here's the method 
for loading nodes from each table:-
for loading nodes from each table:-

  public long restoreCollection() {
      resolveSql();
      _log.debug("restore collection: " + getCollectionName() + " using: "
              + _sql + " and: " + Arrays.deepToString(_columns));
      final BatchInserterIndex _index = makeIndex();
      final long collectionNode = _inserter.createNode(MapUtil.map("name",
              getCollectionName()));

      _log.debug("Query db...");
      getJdbcTemplate().query(_sql, new Object[] {},
              new RowCallbackHandler() {
                  public void processRow(ResultSet row) throws SQLException {
                      final Map<String, Object> properties = extractproperties(row);
                      long node = _inserter.createNode(properties);
                      _inserter.createRelationship(node, collectionNode,
                              RdmRelationship.MEMBER_OF, null);
                      if (_index != null)
                          for (DbColumn col : _columns) {
                              if (col.isIndexed())
                                  _index.add(node, MapUtil.map(col.getName(),
                                          properties.get(col.getName())));
                          }
                      _collectionSize++;
                      if (_collectionSize % FLUSH_INTERVAL == 0) {
                          if (_index != null)
                              _index.flush();
                          _log.debug("Added node: " + _collectionSize
                                  + " to: " + getCollectionName());
                      }
                  }
              });

      // long collectionNode = -1;
      if (_index != null) {
          _index.flush();
      }
      _log.debug("Completed restoring " + _collectionSize + " to: "
              + getCollectionName());
      return collectionNode;
  }
 

and then around that a higher level function that handles all tables:-

  public void run() {
      throwIfNull(_restorers, "Restorers missing");
      throwIfNull(_inserter, "Batch inserter missing");
      int totalNodes = 0;
      int totalRelationships = 0;
      try {
          for (CollectionRestorer r : _restorers) {
              long collection = r.restoreCollection();
              totalNodes += r.getCollectionSize();
              _inserter.createRelationship(_inserter.getGraphDbService()
                      .getReferenceNode().getId(), collection,
                      RdmRelationship.CLASS_EXTENT, null);
          }
          for (ParentChildRelationshipBuilder r : _relators) {
              r.makeRelationships();
              totalRelationships += r.getRelations();
          }
      } finally {
          _inserter.shutdown();
          _log.info("Batch inserter shutdown.  Created: " + totalNodes
                  + " nodes and " + totalRelationships + " relationships");
      }
  }
 
Any suggestions welcome.
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] BatchInserter exhausting heap...?

2011-06-01 Thread Paul Bandler
Is that simply set as a system property, or via the Map passed as the second 
parameter to the BatchInserterImpl constructor?  I've tried both and it doesn't 
seem to help.  Is there some way I can verify that it's being used?

I'm using 1.3 
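
For reference, a minimal sketch of the constructor-map route, assuming 1.3's 
BatchInserterImpl(String, Map<String, String>) constructor; whether cache_type 
actually affects the batch inserter (as opposed to the embedded kernel) is 
worth verifying:

import java.util.Map;

import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class ConfigMapExample {
    public static void main(String[] args) {
        // Kernel settings go in the map passed as the second constructor
        // argument, not in JVM system properties.
        Map<String, String> config = MapUtil.stringMap("cache_type", "weak");
        BatchInserter inserter = new BatchInserterImpl("target/graph.db", config);
        inserter.shutdown();
    }
}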

On 1 Jun 2011, at 18:49, Michael Hunger wrote:

 you could use cache_type=weak
 in the db properties
 
 you can easily introspect java programs (heap) using jmap, jconsole or visualvm
 
 what version of neo4j are you using?
 
 index.flush just sets a flag for immediate index querying
 
 Sent from my iBrick4
 
 
 On 01.06.2011 at 19:18, Paul Bandler pband...@cseuk.co.uk wrote:
 
 I have a simple program that uses the BatchInserter to load rows from a SQL 
 database and am running it on a modestly configured Windows machine with 2GB 
 of RAM and setting the max heap to 500M.
 
 Initially it was running out of memory quite soon so I introduced a flush 
 after every 5000 nodes and it appeared that all was well.  But having got 
 further in the data load it appears to hop along nicely but the memory 
 allocated (simply visible using windows task manager) grows and grows until 
 I suspect it's reached its max heap size and it's written about 2M nodes 
 then abruptly stops making any further discernible progress.  It doesn't 
 fail, just the logging I've put in to log every 5000 nodes has stopped and 
 the CPU is 100% used - garbage collecting I suspect.
 
 Is there something I should be doing periodically in addition to the
 index flush to stop the heap exhaustion?  My code is really simple, here's 
 the method for loading nodes from each table:-
 
 public long restoreCollection() {
     resolveSql();
     _log.debug("restore collection: " + getCollectionName() + " using: "
             + _sql + " and: " + Arrays.deepToString(_columns));
     final BatchInserterIndex _index = makeIndex();
     final long collectionNode = _inserter.createNode(MapUtil.map("name",
             getCollectionName()));

     _log.debug("Query db...");
     getJdbcTemplate().query(_sql, new Object[] {},
             new RowCallbackHandler() {
                 public void processRow(ResultSet row) throws SQLException {
                     final Map<String, Object> properties = extractproperties(row);
                     long node = _inserter.createNode(properties);
                     _inserter.createRelationship(node, collectionNode,
                             RdmRelationship.MEMBER_OF, null);
                     if (_index != null)
                         for (DbColumn col : _columns) {
                             if (col.isIndexed())
                                 _index.add(node, MapUtil.map(col.getName(),
                                         properties.get(col.getName())));
                         }
                     _collectionSize++;
                     if (_collectionSize % FLUSH_INTERVAL == 0) {
                         if (_index != null)
                             _index.flush();
                         _log.debug("Added node: " + _collectionSize
                                 + " to: " + getCollectionName());
                     }
                 }
             });

     // long collectionNode = -1;
     if (_index != null) {
         _index.flush();
     }
     _log.debug("Completed restoring " + _collectionSize + " to: "
             + getCollectionName());
     return collectionNode;
 }
 
 
 and then around that a higher level function that handles all tables:-
 
 public void run() {
     throwIfNull(_restorers, "Restorers missing");
     throwIfNull(_inserter, "Batch inserter missing");
     int totalNodes = 0;
     int totalRelationships = 0;
     try {
         for (CollectionRestorer r : _restorers) {
             long collection = r.restoreCollection();
             totalNodes += r.getCollectionSize();
             _inserter.createRelationship(_inserter.getGraphDbService()
                     .getReferenceNode().getId(), collection,
                     RdmRelationship.CLASS_EXTENT, null);
         }
         for (ParentChildRelationshipBuilder r : _relators) {
             r.makeRelationships();
             totalRelationships += r.getRelations();
         }
     } finally {
         _inserter.shutdown();
         _log.info("Batch inserter shutdown.  Created: " + totalNodes
                 + " nodes and " + totalRelationships + " relationships");
     }
 }
 
 Any suggestions welcome.
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] NEO4J -Newbie Q - How best to query a typed object graph...

2011-05-24 Thread Paul Bandler

 Folks,
 
 I’m trying to use Neo4j to implement a typed object model of enterprise data 
 loaded from an RDB.  I’m a little unclear how best to go about building 
 ‘queries’ to retrieve certain node ‘types’ with ‘where’ clauses that reference 
 properties of related nodes of other types.  Is the traversal framework 
 intended to be used for this?  I get the impression it’s more intended for 
 traversing a graph of more or less homogeneously typed nodes.  I have attached 
 each node ‘type’ to its own ‘collection’ node and indexed the member nodes by 
 my own ‘id’, so is the best option to ‘simply’ iterate through the entire 
 collection (or a subset based on indexed retrieval) of the target type and 
 write code to evaluate each node against my criteria, traversing the 
 relationships out from each node as necessary?
 
 Regards,
 
 Paul Bandler
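
A minimal sketch of the iterate-and-filter pattern the question describes, 
assuming a MEMBER_OF relationship from each instance node to its collection 
node (as in the import code earlier in this archive); the PARENT_OF type and 
property names are illustrative:

import java.util.ArrayList;
import java.util.List;

import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;

public class CollectionScan {
    private static final DynamicRelationshipType MEMBER_OF =
            DynamicRelationshipType.withName("MEMBER_OF");
    private static final DynamicRelationshipType PARENT_OF =
            DynamicRelationshipType.withName("PARENT_OF");

    // A hand-rolled 'where' clause: members of the collection whose related
    // parent node has the given name.
    public static List<Node> membersWithParentNamed(Node collection, String parentName) {
        List<Node> result = new ArrayList<Node>();
        for (Relationship member : collection.getRelationships(MEMBER_OF, Direction.INCOMING)) {
            Node candidate = member.getStartNode();
            // Traverse out from the candidate to the related node the criteria refer to.
            Relationship toParent = candidate.getSingleRelationship(PARENT_OF, Direction.INCOMING);
            Node parent = toParent == null ? null : toParent.getStartNode();
            if (parent != null && parent.hasProperty("name")
                    && parentName.equals(parent.getProperty("name"))) {
                result.add(candidate);
            }
        }
        return result;
    }
}

An index lookup can replace the full scan whenever the criteria touch an 
indexed property; the traversal framework arguably earns its keep mainly for 
variable-depth walks rather than for this kind of typed one-hop filter.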
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Newbie question: Can a Neo4j HA cluster host embedded applications

2011-05-17 Thread Paul Bandler

 Hi,
 
 I’m just reading up on Neo4j and doing some initial experimentation with its 
 APIs.  Now I understand it can be embedded within an application or used as a 
 server or an HA server cluster, but is it possible to use it in an HA/cluster 
 configuration with an application instance hosted on each cluster server, such 
 that each accesses Neo4j using its local APIs?  Perhaps along similar lines, 
 is it possible for the web admin tool to be used to browse an embedded Neo4j 
 database?
 
Regards,

Paul Bandler
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Newbie question: Can a Neo4j HA cluster host embedded applications

2011-05-17 Thread Paul Bandler
Perfect - I hoped as much, and thought I'd seen something like that on the 
slides of your Skills Matter podcast. Are those slides available for download, 
as I can't find them on the Skills Matter site?

Sent from my iPhone

On 17 May 2011, at 11:24, Jim Webber j...@neotechnology.com wrote:

 Hi Paul,
 
 Neo4j server is just a remote API around the database engine with a useful 
 (and pretty) web admin tool. It doesn't fundamentally change the database: the 
 database is still embedded, only it's embedded in our process rather than 
 yours.
 
 Which means you can run HA whether your database instances are embedded 
 within your process or within our process (aka the server). The HA protocol 
 doesn't care - it only cares that it can connect to the right ports on the 
 instances in the cluster.
 
 HTH,
 
 Jim
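
 To make Jim's point concrete, a sketch of an application embedding an HA 
 instance on each cluster member. The HighlyAvailableGraphDatabase class is 
 real for this era, but the config keys and values shown (machine id, HA 
 server address, ZooKeeper coordinators) are placeholders from memory and 
 should be checked against the HA docs for your release:

 import java.util.Map;

 import org.neo4j.graphdb.GraphDatabaseService;
 import org.neo4j.helpers.collection.MapUtil;
 import org.neo4j.kernel.HighlyAvailableGraphDatabase;

 public class EmbeddedHaApp {
     public static void main(String[] args) {
         // Placeholder settings - verify key names against the HA documentation.
         Map<String, String> config = MapUtil.stringMap(
                 "ha.machine_id", "1",
                 "ha.server", "localhost:6001",
                 "ha.zoo_keeper_servers", "zk1:2181,zk2:2181,zk3:2181");
         GraphDatabaseService db = new HighlyAvailableGraphDatabase("var/hadb", config);
         // The application then uses the normal embedded API against its
         // local cluster member.
         db.shutdown();
     }
 }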
 
 On 17 May 2011, at 11:11, Paul Bandler wrote:
 
 
 Hi,
 
 I’m just reading up on Neo4j and doing some initial experimentation with 
 its APIs.  Now I understand it can be embedded within an application or 
 used as a server or an HA server cluster, but is it possible to use it in an 
 HA/cluster configuration with an application instance hosted on each 
 cluster server, such that each accesses Neo4j using its local APIs?  
 Perhaps along similar lines, is it possible for the web admin tool to be used 
 to browse an embedded Neo4j database?
 
 Regards,
 
 Paul Bandler
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user
 
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Color suggestions for the Self-Relationship bike shed

2011-05-16 Thread Paul Bandler
+1 for 1

Sent from my iPhone

On 16 May 2011, at 14:32, Rick Bullotta rick.bullo...@thingworx.com wrote:

 +1 for option 1.
 
 -Original Message-
 From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On 
 Behalf Of Tobias Ivarsson
 Sent: Monday, May 16, 2011 8:12 AM
 To: Neo user discussions
 Subject: [Neo4j] Color suggestions for the Self-Relationship bike shed
 
 There have been a few discussions recently around supporting loops - 
 relationships with the same start node as end node - both here on the mailing 
 list and even more at the Neo Technology office.
 
 We have a working patch for handling loops in Neo4j, but one final piece is
 missing: what should the API look like for creating a loop? There are three
 suggestions for this; I'll list them with the pros and cons we've found for
 each. It would be great if you could provide some feedback on what you think
 on the matter, and which alternative you prefer.
 
 The alternatives:
 
 1. let the existing createRelationshipTo(Node,RelationshipType) create loops
 if the same node is passed as argument.
 2. add a new createLoop(RelationshipType) method for creating loops.
 3. add a createRelationshipOrLoop(Node,RelationshipType) method that would
 work like createRelationshipTo, except it would permit creating loops.
 
 
 The pros and cons:
 
 PRO 1: does not add a new method to the API that developers have to learn.
 
 CON 1: changes the semantics of the createRelationshipTo method slightly
 from what it is today.
 
 CON 1: will not help you catch programming errors where you've mistakenly
 written code that creates a relationship to the same node (in most cases
 where code creates a relationship to the wrong node, it is to the same node).
 
 PRO 2: will let you be explicit of when creating a loop.
 
 PRO 2: will let createRelationshipTo preserve the semantics it has today,
 which will help catch many "relationship to the wrong node" cases.
 
 CON 2: will force you to be explicit about loops; most applications that
 want loops will just treat them as any relationship where the start node
 and end node happen to be the same.
 
 PRO 3: adds loops as a generic construct (start and end just happen to be
 the same) without changing the current semantics of createRelationshipTo.
 
 CON 3: Introduces a new method that creates relationships between any two
 nodes.
 
 
 It would of course be possible to go with both 2 and 3, and I think option 3
 makes more sense as an addition to option 2, rather than as an alternative
 to it.
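
 For concreteness, a sketch of what the option 1 call site would look like 
 once the loop patch lands; note that released versions at the time of this 
 thread do not support loops, and the option 2/3 method names are only the 
 proposals above, not shipped API:

 import org.neo4j.graphdb.DynamicRelationshipType;
 import org.neo4j.graphdb.GraphDatabaseService;
 import org.neo4j.graphdb.Node;
 import org.neo4j.graphdb.Transaction;
 import org.neo4j.kernel.EmbeddedGraphDatabase;

 public class LoopExample {
     public static void main(String[] args) {
         GraphDatabaseService db = new EmbeddedGraphDatabase("target/loopdb");
         Transaction tx = db.beginTx();
         try {
             Node node = db.createNode();
             // Option 1: the existing method simply accepts the same node as
             // its target.  Options 2 and 3 would read node.createLoop(type)
             // and node.createRelationshipOrLoop(node, type) respectively.
             node.createRelationshipTo(node,
                     DynamicRelationshipType.withName("SELF"));
             tx.success();
         } finally {
             tx.finish();
         }
         db.shutdown();
     }
 }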
 
 What do you think?
 -- 
 Tobias Ivarsson tobias.ivars...@neotechnology.com
 Hacker, Neo Technology
 www.neotechnology.com
 Cellphone: +46 706 534857
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user