Hi John,

Thanks a lot for the detailed feedback.

GH issues is the bug tracker on GitHub: https://github.com/neo4j/neo4j/issues

Please report the NaN issue (which might not be so easy to resolve) and, more importantly, the need to disable the transaction log. That log is actually the write-ahead log, which takes care of recovering already committed transactions after machine or program crashes.

There was no out-of-disk issue, I presume?

You can configure the transaction log to be either disabled, or to hold only a certain amount or a certain number of days.

Michael

On Sun, Aug 14, 2016 at 5:07 AM, John Fry <[email protected]> wrote:

> Hi Michael, here is an update
>
> - I found that I was writing NaNs into the array which caused the
> exception in yellow. I fixed this with a simple trap:
>
> if (Float.isFinite(lcm))
>     sl.lcm[idx] = lcm;
> else
>     sl.lcm[idx] = -1.0f;
>
> - When this was fixed I could write to all 200M relationships and the
> db would open in the neo4j-shell - BUT it would flag an exception when
> exiting & closing, i.e. it wouldn't do a clean shutdown.
> - By turning off all transaction logging the db now opens and
> closes without issues and all 200M 16-element float arrays are
> successfully written.
>
> So, to get this working I have to trap for NaNs and turn off transaction
> logging; then my app will write all 200M property arrays and open and close
> cleanly in neo4j-shell.
> I haven't yet tried opening it from a java app - I will let you know if I
> have issues.
>
> What is a GH issue and how do I file one?
>
> Rgds, John
>
> On Saturday, August 13, 2016 at 11:32:44 AM UTC-7, Michael Hunger wrote:
>>
>> Hi John,
>>
>> thanks a lot for reporting back.
>>
>> Would you mind creating a GH issue (if possible reproducible with a
>> minimal test)?
>>
>> Do you get a clean shutdown (db.shutdown()) for your program when creating
>> the data?
>>
>> I haven't seen an error with that kind of property on recovery.
>> Does the recovery error also happen when you open the db again from your
>> java program?
>>
>> Thanks so much,
>>
>> Michael
>>
>> On Sat, Aug 13, 2016 at 6:25 PM, John Fry <[email protected]> wrote:
>>
>>> Hi Michael, using arrays to store the properties solved the performance
>>> issues as you suggested. The application is completing 10x faster easily.
>>>
>>> BUT it creates another problem. From the pseudo-code below I see the
>>> following behaviour:
>>>
>>> - when the array lcm contains 16 test values (all -1.0f) the
>>> application runs at performance and I can open the db (via neo4j-shell)
>>> and see the relationships have 16 x -1.0f stored in a property array
>>> - when lcm contains real and different values (e.g. 16 random
>>> floats) the application runs at performance BUT the .db won't open in
>>> neo4j-shell - it fails with the exception shown below
>>> - if I limit the size of the lcm array to 2 or 4 real/random floats
>>> then it works
>>>
>>> I am guessing the property stores are compressed or something?
>>>
>>> Regards, John.
>>>
>>> public class ScoredLink {
>>>     long id;
>>>     float[] lcm = new float[16];
>>>     ..........etc
>>>
>>> public static void main(String[] args)
>>>     // ...do the math and score the 200M links local, in-memory
>>>     // open the neo4j db
>>>     // create batches of 500 relationships/links to write back
>>>     // push the batches into a thread pool
>>>     // for each thread....
>>>     try ( Transaction tx = db.beginTx() ) {
>>>         for (int i=start; i<=end; i++) { // i.e. start-end=500
>>>             ScoredLink sl = scoredLinks.get(i);
>>>             Relationship l = db.getRelationshipById(sl.id);
>>>             l.setProperty("lwa_lcm", sl.lcm); //all 16 lcm vals
>>>         }
>>>         tx.success(); // mark success inside the try; tx is closed automatically on exit
>>>     }
>>>     ..........etc
>>>
>>> ubuntu@ip-172-31-3-11:/opt/RAI/bin$ sudo neo4j-shell -v -path /opt/neo4j/data/graph.db/
>>> ERROR (-v for expanded information):
>>> Error starting org.neo4j.kernel.impl.factory.CommunityFacadeFactory, /opt/neo4j/data/graph.db
>>> java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.CommunityFacadeFactory, /opt/neo4j/data/graph.db
>>> at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:143)
>>> at org.neo4j.kernel.impl.factory.CommunityFacadeFactory.newFacade(CommunityFacadeFactory.java:43)
>>> at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:108)
>>> at org.neo4j.graphdb.factory.GraphDatabaseFactory.newDatabase(GraphDatabaseFactory.java:129)
>>> at org.neo4j.graphdb.factory.GraphDatabaseFactory$1.newDatabase(GraphDatabaseFactory.java:117)
>>> at org.neo4j.graphdb.factory.GraphDatabaseBuilder.newGraphDatabase(GraphDatabaseBuilder.java:185)
>>> at org.neo4j.shell.kernel.GraphDatabaseShellServer.instantiateGraphDb(GraphDatabaseShellServer.java:203)
>>> at org.neo4j.shell.kernel.GraphDatabaseShellServer.<init>(GraphDatabaseShellServer.java:66)
>>> at org.neo4j.shell.StartClient.getGraphDatabaseShellServer(StartClient.java:282)
>>> at org.neo4j.shell.StartClient.tryStartLocalServerAndClient(StartClient.java:259)
>>> at org.neo4j.shell.StartClient.startLocal(StartClient.java:247)
>>> at org.neo4j.shell.StartClient.start(StartClient.java:180)
>>> at org.neo4j.shell.StartClient.main(StartClient.java:135)
>>> Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.recovery.Recovery@10c38489' failed to initialize. Please see attached cause exception.
>>> at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:434)
>>> at org.neo4j.kernel.lifecycle.LifeSupport.init(LifeSupport.java:66)
>>> at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:102)
>>> at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:600)
>>> at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
>>> at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
>>> at org.neo4j.kernel.impl.transaction.state.DataSourceManager.start(DataSourceManager.java:112)
>>> at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
>>> at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
>>> at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:139)
>>> ... 12 more
>>> Caused by: java.lang.IllegalArgumentException: Unknown entry type 7 for version 0. At position LogPosition{logVersion=170, byteOffset=16} and entry version V1_9
>>> at org.neo4j.kernel.impl.transaction.log.entry.LogEntryVersion.entryParser(LogEntryVersion.java:207)
>>> at org.neo4j.kernel.impl.transaction.log.entry.VersionAwareLogEntryReader.readLogEntry(VersionAwareLogEntryReader.java:92)
>>> at org.neo4j.kernel.impl.transaction.log.LogEntryCursor.next(LogEntryCursor.java:54)
>>> at org.neo4j.kernel.recovery.LatestCheckPointFinder.find(LatestCheckPointFinder.java:77)
>>> at org.neo4j.kernel.recovery.PositionToRecoverFrom.apply(PositionToRecoverFrom.java:53)
>>> at org.neo4j.kernel.recovery.DefaultRecoverySPI.getPositionToRecoverFrom(DefaultRecoverySPI.java:135)
>>> at org.neo4j.kernel.recovery.Recovery.init(Recovery.java:72)
>>> at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:424)
>>> ... 21 more
>>>
>>> -host Domain name or IP of host to connect to (default: localhost)
>>> -port Port of host to connect to (default: 1337)
>>> -name RMI name, i.e. rmi://<host>:<port>/<name> (default: shell)
>>> -pid Process ID to connect to
>>> -c Command line to execute. After executing it the shell exits
>>> -file File containing commands to execute, or '-' to read from stdin. After executing it the shell exits
>>> -readonly Connect in readonly mode (only for connecting with -path)
>>> -path Points to a neo4j db path so that a local server can be started there
>>> -config Points to a config file when starting a local server
>>>
>>> Example arguments for remote:
>>> -port 1337
>>> -host 192.168.1.234 -port 1337 -name shell
>>> -host localhost -readonly
>>> ...or no arguments for default values
>>> Example arguments for local:
>>> -path /path/to/db
>>> -path /path/to/db -config /path/to/neo4j.config
>>> -path /path/to/db -readonly
>>>
>>> On Tuesday, August 9, 2016 at 1:32:21 PM UTC-7, Michael Hunger wrote:
>>>>
>>>> Oh sorry, I might have misunderstood you.
>>>>
>>>> Do you see the performance issue when creating the data or when
>>>> accessing it?
>>>>
>>>> Could you share your graph-creation code?
>>>>
>>>> M
>>>>
>>>> On Tue, Aug 9, 2016 at 3:51 PM, John Fry <[email protected]> wrote:
>>>>
>>>>> Hi Michael, thanks...
>>>>>
>>>>> some more background info on the queries:
>>>>> * note I am using neo ver 2.2 (I guess I should finally upgrade to 3+)
>>>>> * everything I do is via the java api
>>>>> * the queries are traversals and expansions:
>>>>> --- I walk the graph node to node selecting each node by a function of
>>>>> the weight vectors
>>>>> --- I expand around a node to a depth of n for both incoming and
>>>>> outgoing directions
>>>>> --- I commonly use shortest path using dijkstra with my own cost
>>>>> evaluators that use the weight vectors
>>>>> --- once I have a reliable way to write all the properties I will use
>>>>> the graph exclusively in 'read-only' mode. I only write the properties as
>>>>> part of a graph creation process which is a single event usage - fast and
>>>>> predictable creation of course is nice to achieve.
>>>>>
>>>>> I turn off transaction logging with: keep_logical_logs=false.
>>>>>
>>>>> Let me try using an integer array as a single property and see how
>>>>> that performs.
>>>>>
>>>>> Thanks, John.
>>>>>
>>>>> On Tuesday, August 9, 2016 at 3:35:50 AM UTC-7, Michael Hunger wrote:
>>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> which kind of "transaction logging" did you turn off?
>>>>>>
>>>>>> Would you be able to share the queries you are using?
>>>>>>
>>>>>> Each double property takes 8 bytes of storage in the property-record
>>>>>> (which are linked in a chain, each property-record can hold up to 4
>>>>>> 4-byte-storage properties).
>>>>>>
>>>>>> But arrays are optimized, esp. if you have small values in your
>>>>>> weights it tries to use only the significant bits to encode values in an
>>>>>> array (but I think it might only do that for integer values).
>>>>>>
>>>>>> Would you be able to run a test where instead of having 5-10
>>>>>> individual properties you just use an array with that many entries?
>>>>>>
>>>>>> And perhaps even better project the floating point values to
>>>>>> integer values in that array.
>>>>>>
>>>>>> I will also ask our kernel engineers for other tips in this regard.
>>>>>>
>>>>>> HTH,
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>> On Mon, Aug 8, 2016 at 6:56 PM, John Fry <[email protected]> wrote:
>>>>>>
>>>>>>> Hello Michael,
>>>>>>>
>>>>>>> the graph is used as follows:
>>>>>>>
>>>>>>> - ~10M nodes; ~200M relationships
>>>>>>> - Each relationship requires multiple floating point properties that
>>>>>>> can be considered connecting strength weights. These multiple
>>>>>>> weights make up a weight vector - up to ~20 weights per vector
>>>>>>> - The weights on the relationship are static (or at least they
>>>>>>> rarely change)
>>>>>>> - The weight vector is used to compute custom (very algorithmic
>>>>>>> in nature) costs per link to drive node-to-node traversals,
>>>>>>> expansions and to find cost based n-shortest paths
>>>>>>> - The costs per link are calculated in as close to real time as
>>>>>>> possible and are always different and are never stored or written
>>>>>>> back to the relationships in the graph
>>>>>>>
>>>>>>> Regards, John.
>>>>>>>
>>>>>>> On Monday, August 8, 2016 at 12:12:13 AM UTC-7, Michael Hunger wrote:
>>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> Do you have more details on the properties that you add as well as
>>>>>>>> your graph model and queries? Without these details it will be hard to
>>>>>>>> help.
>>>>>>>>
>>>>>>>> It sounds a bit as if your property heavy relationships might be
>>>>>>>> nodes in hiding.
>>>>>>>>
>>>>>>>> Cheers Michael
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>> On 08.08.2016 at 06:05, John Fry <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> In neo4j 2.3 what / where are the limits when storing properties on
>>>>>>>> relationships?
>>>>>>>>
>>>>>>>> I have a graph with about 200M relationships and for each
>>>>>>>> relationship I want to add floating point attributes as properties.
>>>>>>>> Here is what I am experiencing:
>>>>>>>>
>>>>>>>> - adding 2 properties per rel - all works fine; very good
>>>>>>>> performance
>>>>>>>> - adding 5 properties per rel - start to see exceptions/crashes
>>>>>>>> - can be fixed by turning off transaction logging - good performance
>>>>>>>> - adding ~7 properties per rel - performance dramatically
>>>>>>>> fades (10x slower) - occasional exceptions/crashes
>>>>>>>> - adding ~10 properties per rel - performance stalls/stops -
>>>>>>>> eventually will crash
>>>>>>>>
>>>>>>>> What is a realistic set of expectations for storing this many
>>>>>>>> properties where the relationship store could easily exceed 20GB?
>>>>>>>>
>>>>>>>> Regards and thanks for any advice, John.
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "Neo4j" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
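
For reference, the NaN trap John describes and the integer projection Michael suggests could be sketched as one small helper that is applied to each weight array before it is written as a property. This is only a sketch under stated assumptions: the class name `WeightArrays`, the method names, and the scale factor are hypothetical and not part of the Neo4j API. The -1.0f sentinel is the value used in the thread.

```java
import java.util.Arrays;

/** Hypothetical helper for preparing weight arrays; not part of the Neo4j API. */
final class WeightArrays {

    /** Sentinel stored in place of NaN or infinite weights (value taken from the thread). */
    static final float MISSING = -1.0f;

    private WeightArrays() {
    }

    /** Returns a copy of the input in which every non-finite value is replaced by MISSING. */
    static float[] sanitize(float[] weights) {
        float[] out = Arrays.copyOf(weights, weights.length);
        for (int i = 0; i < out.length; i++) {
            if (!Float.isFinite(out[i])) {
                out[i] = MISSING;
            }
        }
        return out;
    }

    /**
     * Projects weights to ints by fixed-point scaling: round(weight * scale).
     * Assumes the input is already finite (call sanitize first) and that
     * weight * scale fits into an int. The scale is an assumption chosen by the caller.
     */
    static int[] toFixedPoint(float[] weights, int scale) {
        int[] out = new int[weights.length];
        for (int i = 0; i < weights.length; i++) {
            out[i] = Math.round(weights[i] * scale);
        }
        return out;
    }
}
```

In the write loop from the thread this would be used as `l.setProperty("lwa_lcm", WeightArrays.sanitize(sl.lcm));`. Whether storing the result of `toFixedPoint` is actually more compact depends on the store format, which the thread leaves open.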
