Hi John, thanks a lot for reporting back.
Would you mind creating a GH issue (if possible, reproducible with a minimal test)?

Do you get a clean shutdown (db.shutdown()) for your program when creating the data?
I haven't seen an error with that kind of property on recovery.
Does the recovery error also happen when you open the db again from your Java program?

Thanks so much,
Michael

On Sat, Aug 13, 2016 at 6:25 PM, John Fry <[email protected]> wrote:

> Hi Michael, using arrays to store the properties solved the performance
> issues as you suggested. The application is completing 10x faster easily
>
> BUT it creates another problem. From the pseudo-code below I see the
> following behaviour:
>
> - when the array lcm contains 16 test values (all -1.0f) the
>   application runs at performance and I can open the db (via neo4j-shell)
>   and see the relationship have 16 x -1.0fs stored in a property array
> - when lcm contains real and different values (e.g. 16 random floats)
>   the application runs at performance BUT the .db won't open in
>   neo4j-shell - it fails with the exception shown below
> - if I limit the size of the lcm array to 2 or 4 real/random floats
>   then it works
>
> I am guessing the property stores are compressed or something?
>
> Regards, John.
>
> public class ScoredLink {
>     long id;
>     float[] lcm = new float[16];
>     ..........etc
>
> public static void main(String[] args)
>     // ...do the math and score the 200M links local, in-memory
>     // open the neo4j db
>     // create batches of 500 relationships/links to write back
>     // push the batches into a thread pool
>     // for each thread....
>     try ( Transaction tx = db.beginTx() ) {
>         for (int i=start; i<=end; i++) { // i.e. start-end=500
>             ScoredLink sl = scoredLinks.get(i);
>             Relationship l = db.getRelationshipById(sl.id);
>             l.setProperty("lwa_lcm", sl.lcm); //all 16 lcm vals
>         }
>     }
>     tx.success();
>     tx.close();
> ..........etc
>
> ubuntu@ip-172-31-3-11:/opt/RAI/bin$ sudo neo4j-shell -v -path /opt/neo4j/data/graph.db/
> ERROR (-v for expanded information):
> Error starting org.neo4j.kernel.impl.factory.CommunityFacadeFactory, /opt/neo4j/data/graph.db
> java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.CommunityFacadeFactory, /opt/neo4j/data/graph.db
>     at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:143)
>     at org.neo4j.kernel.impl.factory.CommunityFacadeFactory.newFacade(CommunityFacadeFactory.java:43)
>     at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:108)
>     at org.neo4j.graphdb.factory.GraphDatabaseFactory.newDatabase(GraphDatabaseFactory.java:129)
>     at org.neo4j.graphdb.factory.GraphDatabaseFactory$1.newDatabase(GraphDatabaseFactory.java:117)
>     at org.neo4j.graphdb.factory.GraphDatabaseBuilder.newGraphDatabase(GraphDatabaseBuilder.java:185)
>     at org.neo4j.shell.kernel.GraphDatabaseShellServer.instantiateGraphDb(GraphDatabaseShellServer.java:203)
>     at org.neo4j.shell.kernel.GraphDatabaseShellServer.<init>(GraphDatabaseShellServer.java:66)
>     at org.neo4j.shell.StartClient.getGraphDatabaseShellServer(StartClient.java:282)
>     at org.neo4j.shell.StartClient.tryStartLocalServerAndClient(StartClient.java:259)
>     at org.neo4j.shell.StartClient.startLocal(StartClient.java:247)
>     at org.neo4j.shell.StartClient.start(StartClient.java:180)
>     at org.neo4j.shell.StartClient.main(StartClient.java:135)
> Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.recovery.Recovery@10c38489' failed to initialize. Please see attached cause exception.
>     at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:434)
>     at org.neo4j.kernel.lifecycle.LifeSupport.init(LifeSupport.java:66)
>     at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:102)
>     at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:600)
>     at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
>     at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
>     at org.neo4j.kernel.impl.transaction.state.DataSourceManager.start(DataSourceManager.java:112)
>     at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
>     at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
>     at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:139)
>     ... 12 more
> Caused by: java.lang.IllegalArgumentException: Unknown entry type 7 for version 0. At position LogPosition{logVersion=170, byteOffset=16} and entry version V1_9
>     at org.neo4j.kernel.impl.transaction.log.entry.LogEntryVersion.entryParser(LogEntryVersion.java:207)
>     at org.neo4j.kernel.impl.transaction.log.entry.VersionAwareLogEntryReader.readLogEntry(VersionAwareLogEntryReader.java:92)
>     at org.neo4j.kernel.impl.transaction.log.LogEntryCursor.next(LogEntryCursor.java:54)
>     at org.neo4j.kernel.recovery.LatestCheckPointFinder.find(LatestCheckPointFinder.java:77)
>     at org.neo4j.kernel.recovery.PositionToRecoverFrom.apply(PositionToRecoverFrom.java:53)
>     at org.neo4j.kernel.recovery.DefaultRecoverySPI.getPositionToRecoverFrom(DefaultRecoverySPI.java:135)
>     at org.neo4j.kernel.recovery.Recovery.init(Recovery.java:72)
>     at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:424)
>     ... 21 more
>
>  -host      Domain name or IP of host to connect to (default: localhost)
>  -port      Port of host to connect to (default: 1337)
>  -name      RMI name, i.e. rmi://<host>:<port>/<name> (default: shell)
>  -pid       Process ID to connect to
>  -c         Command line to execute. After executing it the shell exits
>  -file      File containing commands to execute, or '-' to read from
>             stdin. After executing it the shell exits
>  -readonly  Connect in readonly mode (only for connecting with -path)
>  -path      Points to a neo4j db path so that a local server can be
>             started there
>  -config    Points to a config file when starting a local server
>
> Example arguments for remote:
>   -port 1337
>   -host 192.168.1.234 -port 1337 -name shell
>   -host localhost -readonly
>   ...or no arguments for default values
> Example arguments for local:
>   -path /path/to/db
>   -path /path/to/db -config /path/to/neo4j.config
>   -path /path/to/db -readonly
>
> On Tuesday, August 9, 2016 at 1:32:21 PM UTC-7, Michael Hunger wrote:
>>
>> Oh sorry, I might have misunderstood you.
>>
>> Do you see the performance issue when creating the data or when accessing
>> it?
>>
>> Could you share your graph-creation code?
>>
>> M
>>
>> On Tue, Aug 9, 2016 at 3:51 PM, John Fry <[email protected]> wrote:
>>
>>> Hi Michael, thanks...
>>>
>>> some more background info on the queries:
>>> * note I am using neo ver 2.2 (I guess I should finally upgrade to 3+)
>>> * everything I do is via the java api
>>> * the queries are traversals and expansions:
>>> --- I walk the graph node to node selecting each node by a function of
>>>     the weight vectors
>>> --- I expand around a node to a depth of n for both incoming and
>>>     outgoing directions
>>> --- I commonly use shortest path using dijkstra with my own cost
>>>     evaluators that use the weight vectors
>>> --- once I have a reliable way to write all the properties I will use
>>>     the graph exclusively in 'read-only' mode. I only write the
>>>     properties as part of a graph creation process which is a single
>>>     event usage - fast and predictable creation of course is nice to
>>>     achieve.
>>>
>>> I turn off transaction logging with: keep_logical_logs=false.
>>>
>>> Let me try using an integer array as a single property and see how that
>>> performs.
>>>
>>> Thanks, John.
>>>
>>> On Tuesday, August 9, 2016 at 3:35:50 AM UTC-7, Michael Hunger wrote:
>>>>
>>>> Hi John,
>>>>
>>>> which kind of "transaction logging" did you turn off?
>>>>
>>>> Would you be able to share the queries you are using?
>>>>
>>>> each double property takes 8 bytes of storage in the property-record
>>>> (which are linked in a chain, each property-record can hold up to 4
>>>> 4-byte-storage properties).
>>>>
>>>> But arrays are optimized, esp. if you have small values in your weights
>>>> it tries to use only the significant bits to encode values in an array
>>>> (but I think it might only do that for integer values).
>>>>
>>>> Would you be able to run a test where instead of having 5-10 individual
>>>> properties you just use an array with that many entries?
>>>>
>>>> And perhaps even better, project the floating point values to
>>>> integer values in that array.
>>>>
>>>> I'll also ask our kernel engineers for other tips in this regard.
>>>>
>>>> HTH,
>>>>
>>>> Michael
>>>>
>>>> On Mon, Aug 8, 2016 at 6:56 PM, John Fry <[email protected]> wrote:
>>>>
>>>>> Hello Michael,
>>>>>
>>>>> the graph is used as follows:
>>>>>
>>>>> - ~10M nodes; ~200M relationships
>>>>> - Each relationship requires multiple floating properties that can
>>>>>   be considered connecting strength weights.
>>>>>   These multiple weights make up a weight vector - up to ~20 weights
>>>>>   per vector
>>>>> - The weights on the relationship are static (or at least they
>>>>>   rarely change)
>>>>> - The weight vector is used to compute custom (very algorithmic in
>>>>>   nature) costs per link to drive node-to-node traversals, expansions
>>>>>   and to find cost based n-shortest paths
>>>>> - The costs per link are calculated in as close to real time as
>>>>>   possible and are always different and are never stored or written
>>>>>   back to the relationships in the graph
>>>>>
>>>>> Regards, John.
>>>>>
>>>>> On Monday, August 8, 2016 at 12:12:13 AM UTC-7, Michael Hunger wrote:
>>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> Do you have more details on the properties that you add as well as
>>>>>> your graph model and queries? Without these details it will be hard
>>>>>> to help.
>>>>>>
>>>>>> It sounds a bit as if your property heavy relationships might be
>>>>>> nodes in hiding.
>>>>>>
>>>>>> Cheers Michael
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On 08.08.2016 at 06:05, John Fry <[email protected]> wrote:
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> In neo4j 2.3 what / where are the limits when storing properties on
>>>>>> relationships?
>>>>>>
>>>>>> I have a graph with about 200M relationships and for each
>>>>>> relationship I want to add floating point attributes as properties.
>>>>>> Here is what I am experiencing:
>>>>>>
>>>>>> - adding 2 properties per rel - all works fine; very good
>>>>>>   performance
>>>>>> - adding 5 properties per rel - start to see exceptions/crashes -
>>>>>>   can be fixed by turning off transaction logging - good performance
>>>>>> - adding ~7 properties per rel - performance dramatically fades
>>>>>>   (10x slower) - occasional exceptions/crashes
>>>>>> - adding ~10 properties per rel - performance stalls/stops -
>>>>>>   eventually will crash
>>>>>>
>>>>>> What is a realistic set of expectations for storing this many
>>>>>> properties where the relationship store could easily exceed 20GB?
>>>>>>
>>>>>> Regards and thanks for any advice, John.
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Neo4j" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
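
For reference, the suggestion earlier in the thread to project the floating point weights to integer values before storing them as a single array property could look roughly like the sketch below. The class name WeightQuantizer and the scale factor of 10000 (about four decimal digits) are assumptions made for illustration; neither is part of the Neo4j API, and the right scale depends on the precision the weights actually need.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Neo4j: converts a float weight vector
// to a fixed-point int[] and back.
class WeightQuantizer {
    // Assumed scale: keeps roughly four decimal digits of each weight.
    static final float SCALE = 10_000f;

    // Project each float weight to a scaled, rounded integer.
    static int[] toInts(float[] weights) {
        int[] out = new int[weights.length];
        for (int i = 0; i < weights.length; i++) {
            out[i] = Math.round(weights[i] * SCALE);
        }
        return out;
    }

    // Recover approximate float weights from the stored integers.
    static float[] toFloats(int[] stored) {
        float[] out = new float[stored.length];
        for (int i = 0; i < stored.length; i++) {
            out[i] = stored[i] / SCALE;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] lcm = {-1.0f, 0.25f, 0.5f};
        int[] stored = toInts(lcm);
        System.out.println(Arrays.toString(stored));
        System.out.println(Arrays.toString(toFloats(stored)));
    }
}
```

The int[] returned by toInts could then be passed to setProperty in place of the float[], and converted back with toFloats inside the cost evaluators. Whether this changes the recovery behaviour described above is untested; it only illustrates the encoding idea.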
