Hi, Andrey. I was using 2.2.10, but just to be sure, I ran it a second time, making sure that 2.2.10 was the first thing in my classpath, and I'm afraid I saw it again. It's quite reproducible (anywhere between 200 and 250 million edges).
Regards,

Phillip

On Monday, September 26, 2016 at 9:06:44 AM UTC+1, Andrey Lomakin wrote:

> Hi,
>
> I have looked at your thread dump; we had already identified and fixed your issue in the 2.2.9 version. So if you use 2.2.10 (the latest one), you will not experience this problem.
>
> I strongly recommend using the 2.2.10 version because several deadlocks were fixed in 2.2.9, and 2.2.10 also contains a few minor optimizations.
>
> On Fri, Sep 23, 2016 at 6:51 PM Phillip Henry <phill...@gmail.com> wrote:
>
>> Hi, Luca.
>>
>> > How many GB?
>>
>> The input file is 22GB of text.
>>
>> > If the file is ordered ...
>>
>> You are only sorting by the first account. The second account can be anywhere in the entire range. My understanding is that both vertices are updated when an edge is written. If this is true, will there not be potential contention when the "to" vertex is updated?
>>
>> > OGraphBatchInsert ... keeps everything in RAM before flushing
>>
>> I assume I will still have to write retry code in the event of a collision (see above)?
>>
>> > You can use support --at- orientdb.com ...
>>
>> Sent.
>>
>> Regards,
>>
>> Phill
>>
>> On Friday, September 23, 2016 at 4:06:49 PM UTC+1, l.garulli wrote:
>>
>>> On 23 September 2016 at 03:50, Phillip Henry <phill...@gmail.com> wrote:
>>>
>>>> > How big is the file that sort cannot write?
>>>>
>>>> One bil-ee-on lines... :-P
>>>
>>> How many GB?
>>>
>>>> > ...This should help a lot.
>>>>
>>>> The trouble is that the size of a block of contiguous accounts in the real data is non-uniform (even if it might be uniform in my test data). Therefore, it is highly likely that a contiguous block of account numbers will span 2 or more batches. This will lead to a lot of contention. In your example, if Account 2 spills over into the next batch, chances are I'll have to roll back that batch.
>>>>
>>>> Don't you also have a problem that if X, Y, Z and W in your example are account numbers in the next batch, you'll also get contention? Admittedly, randomization doesn't solve this problem either.
>>>
>>> If the file is ordered, you could have X threads (where X is the number of cores) that parse the file non-sequentially. For example, with 4 threads, you could start the parsing this way:
>>>
>>> Thread 1 starts from 0
>>> Thread 2 starts from length * 1/4
>>> Thread 3 starts from length * 2/4
>>> Thread 4 starts from length * 3/4
>>>
>>> Of course, the parsing should skip ahead to the next line terminator (CR+LF) if it's a CSV. It requires some lines of code, but you could avoid many conflicts.
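(For reference, a minimal untested sketch of this partitioned parsing, assuming a plain-text file with one payment per line; the per-line processing is left as a stub:)

import java.io.File;
import java.io.RandomAccessFile;

// Each worker parses its own byte range of the file, aligning to line
// boundaries so that a line spanning two ranges is handled exactly once.
public class PartitionedParser implements Runnable {
    private final String path;
    private final long start, end; // byte offsets of this worker's slice

    PartitionedParser(String path, long start, long end) {
        this.path = path; this.start = start; this.end = end;
    }

    public void run() {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.seek(start);
            if (start > 0) raf.readLine(); // skip the partial line; the previous worker finishes it
            String line;
            while (raf.getFilePointer() <= end && (line = raf.readLine()) != null) {
                process(line); // parse "FROM_ACCOUNT TO_ACCOUNT AMOUNT" and write the edge
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private void process(String line) { /* insert vertices/edges here */ }

    public static void main(String[] args) {
        String path = args[0];
        int threads = Runtime.getRuntime().availableProcessors();
        long length = new File(path).length(), slice = length / threads;
        for (int i = 0; i < threads; i++) {
            long from = i * slice;
            long to = (i == threads - 1) ? length : (i + 1) * slice;
            new Thread(new PartitionedParser(path, from, to)).start();
        }
    }
}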
>>>> > you can use the special Batch Importer: OGraphBatchInsert
>>>>
>>>> Would this not be subject to the same contention problems? At what point is it flushed to disk? (Obviously, it can't live in heap forever.)
>>>
>>> It keeps everything in RAM before flushing. Up to a few hundred million vertices/edges should be fine if you have a lot of heap, like 58GB (and 4GB of DISKCACHE). It depends on the number of attributes you have.
>>>
>>>> > You should definitely use transactions with a batch size of 100 items.
>>>>
>>>> I thought I read somewhere else (can't find the link at the moment) that you said to only use transactions when using the remote protocol?
>>>
>>> This was true before v2.2. With v2.2 the management of the transaction is parallel and very light. Transactions work well with graphs because every addEdge() operation is 2 updates, and having a TX that works like a batch really helps.
>>>
>>>> > Please use the latest 2.2.10. ... try to define 50GB of DISKCACHE and 14GB of Heap
>>>>
>>>> Will do on the next run.
>>>>
>>>> > If it happens again, could you please send a thread dump?
>>>>
>>>> I have the full thread dump, but it's on my work machine, so I can't post it in this forum (all access to Google Groups is banned by the bank, so I am writing this on my personal computer). Happy to email it to you. Which email shall I use?
>>>
>>> You can use support --at- orientdb.com, referring to this thread in the subject.
>>>
>>>> Phill
>>>
>>> Best Regards,
>>>
>>> Luca Garulli
>>> Founder & CEO
>>> OrientDB LTD <http://orientdb.com/>
>>>
>>> Want to share your opinion about OrientDB? Rate & review us at Gartner's Software Review <https://www.gartner.com/reviews/survey/home>
>>>
>>>> On Friday, September 23, 2016 at 7:41:29 AM UTC+1, l.garulli wrote:
>>>>
>>>>> On 23 September 2016 at 00:49, Phillip Henry <phill...@gmail.com> wrote:
>>>>>
>>>>>> Hi, Luca.
>>>>>
>>>>> Hi Phillip.
>>>>>
>>>>>> I have:
>>>>>>
>>>>>> 4. Sorting is an overhead, albeit outside of Orient. Using the Unix sort command failed with "No space left on device". Oops. OK, so I ran my program to generate the data again; this time it is ordered by the first account number. Performance was much slower, as there appeared to be a lot of contention for this account (i.e., all writes were contending for this account, even if the other account had less contention). More randomized data was faster.
>>>>>
>>>>> How big is the file that sort cannot write? Anyway, if you have the accounts sorted, you should have transactions of about 100 items where the bank account and its edges are in the same transaction. This should help a lot. Example:
>>>>>
>>>>> Account 1 -> Payment 1 -> Account X
>>>>> Account 1 -> Payment 2 -> Account Y
>>>>> Account 1 -> Payment 3 -> Account Z
>>>>> Account 2 -> Payment 1 -> Account X
>>>>> Account 2 -> Payment 1 -> Account W
>>>>>
>>>>> If the transaction batch is 5 (I suggest you start with 100), all these operations are executed in one transaction. If another thread has:
>>>>>
>>>>> Account 99 -> Payment 1 -> Account W
>>>>>
>>>>> it could conflict because of the shared Account W.
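(A sketch of such a batch-per-transaction load with retry-on-conflict, assuming the Account.number index and PAYMENT edge class from this thread, and rows already parsed into from/to/amount triples:)

import com.orientechnologies.orient.core.exception.OConcurrentModificationException;
import com.tinkerpop.blueprints.Edge;
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.impls.orient.OrientGraph;
import com.tinkerpop.blueprints.impls.orient.OrientGraphFactory;
import java.util.List;

public class BatchWriter {
    // Writes one batch (~100 rows) in a single transaction; if another thread
    // updated a shared vertex concurrently, rolls back and retries the batch.
    static void writeBatch(OrientGraphFactory factory, List<String[]> rows) {
        while (true) {
            OrientGraph g = factory.getTx();
            try {
                for (String[] row : rows) { // row = {from, to, amount}
                    Vertex from = g.getVertexByKey("Account.number", row[0]);
                    Vertex to = g.getVertexByKey("Account.number", row[1]);
                    Edge payment = from.addEdge("PAYMENT", to);
                    payment.setProperty("amount", Double.parseDouble(row[2]));
                }
                g.commit();
                return; // batch committed
            } catch (OConcurrentModificationException e) {
                g.rollback(); // conflict on a shared vertex: retry the whole batch
            } finally {
                g.shutdown();
            }
        }
    }
}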
>>>>> If you can export Account IDs that are numeric and incremental, you can use the special Batch Importer: OGraphBatchInsert. Example:
>>>>>
>>>>> OGraphBatchInsert batch = new OGraphBatchInsert("plocal:/temp/mydb", "admin", "admin");
>>>>> batch.begin();
>>>>>
>>>>> batch.createEdge(0L, 1L, null); // CREATE AN EDGE BETWEEN VERTICES 0 AND 1.
>>>>>                                 // IF THE VERTICES DON'T EXIST, THEY ARE CREATED IMPLICITLY
>>>>> batch.createEdge(1L, 2L, null);
>>>>> batch.createEdge(2L, 0L, null);
>>>>>
>>>>> batch.createVertex(3L); // CREATE A NON-CONNECTED VERTEX
>>>>>
>>>>> Map<String, Object> vertexProps = new HashMap<String, Object>();
>>>>> vertexProps.put("foo", "foo");
>>>>> vertexProps.put("bar", 3);
>>>>> batch.setVertexProperties(0L, vertexProps); // SET PROPERTIES FOR VERTEX 0
>>>>> batch.end();
>>>>>
>>>>> This is blazing fast, but it uses heap, so run it with a lot of it.
>>>>>
>>>>>> 6. I've multithreaded my loader. The details are now:
>>>>>>
>>>>>> - using plocal
>>>>>> - using 30 threads
>>>>>> - not using transactions (OrientGraphFactory.getNoTx)
>>>>>
>>>>> You should definitely use transactions with a batch size of 100 items. This speeds things up.
>>>>>
>>>>>> - retrying forever upon write collisions
>>>>>> - using Orient 2.2.7
>>>>>
>>>>> Please use the latest, 2.2.10.
>>>>>
>>>>>> - using -XX:MaxDirectMemorySize=258040m
>>>>>
>>>>> This is not really important; it's just an upper bound for the JVM. Please set it to 512GB so you can forget about it. The 2 most important values are DISKCACHE and JVM heap. Their sum must be lower than the RAM available in the server before you run OrientDB.
>>>>>
>>>>> If you have 64GB, try to define 50GB of DISKCACHE and 14GB of Heap.
>>>>>
>>>>> If you use the Batch Importer, you should use more Heap and less DISKCACHE.
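(For a 64GB box, that might translate into a launch line like the following; storage.diskCache.bufferSize is OrientDB's disk-cache size in MB, and MyLoader stands in for the actual import class:)

java -Xmx14g \
     -Dstorage.diskCache.bufferSize=51200 \
     -XX:MaxDirectMemorySize=512g \
     -cp orientdb-graphdb-2.2.10.jar:... MyLoader payments.txt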
>>>>>> The good news is I've achieved an initial write throughput of about 30k/second.
>>>>>>
>>>>>> The bad news is I've tried several runs and have only been able to achieve 200mil < number of writes < 300mil.
>>>>>>
>>>>>> The first time I tried it, the loader deadlocked. Using jstack showed that the deadlock was between 3 threads at:
>>>>>> - OOneKeyEntryPerKeyLockManager.acquireLock(OOneKeyEntryPerKeyLockManager.java:173)
>>>>>> - OPartitionedLockManager.acquireExclusiveLock(OPartitionedLockManager.java:210)
>>>>>> - OOneKeyEntryPerKeyLockManager.acquireLock(OOneKeyEntryPerKeyLockManager.java:171)
>>>>>
>>>>> If it happens again, could you please send a thread dump?
>>>>>
>>>>>> The second time, it failed due to a NullPointerException at OByteBufferPool.java:297. I've looked at the code, and the only way I can see this happening is if OByteBufferPool.allocateBuffer throws an error (perhaps an OutOfMemoryError in java.nio.Bits.reserveMemory). This StackOverflow posting (http://stackoverflow.com/questions/8462200/examples-of-forcing-freeing-of-native-memory-direct-bytebuffer-has-allocated-us) seems to indicate that this can happen if the underlying DirectByteBuffer's Cleaner doesn't have its clean() method called.
>>>>>
>>>>> This is because the database was bigger than this setting: -XX:MaxDirectMemorySize=258040m. Please set it to 512GB (see above).
>>>>>
>>>>>> Alternatively, I followed the SO suggestion and lowered the heap space to a mere 1GB (it was 50GB) to make the GC more active. Unfortunately, after a good start, the job is still running some 15 hours later with a hugely reduced write throughput (~7k/s). Jstat shows 4292 full GCs taking a total time of 4597s - not great, but not hugely awful either. At this rate, the remaining 700mil or so payments are going to take another 30 hours.
>>>>>
>>>>> See the suggested settings above.
>>>>>
>>>>>> 7. Even with the highest throughput I have achieved, 30k writes per second, I'm looking at about 20 hours of loading. We've taken the same data and, after trial and error that was not without its own problems, put it into Neo4J in 37 minutes. This is a significant difference. It appears that they are approaching the problem differently to avoid contention on updating the vertices during an edge write.
>>>>>
>>>>> With all these suggestions you should be able to get much better numbers. If you can use the Batch Importer, the numbers should be close to Neo4j's.
>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Phillip
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Luca Garulli
>>>>> Founder & CEO
>>>>> OrientDB LTD <http://orientdb.com/>
>>>>>
>>>>> Want to share your opinion about OrientDB? Rate & review us at Gartner's Software Review <https://www.gartner.com/reviews/survey/home>
>>>>>
>>>>>> On Thursday, September 15, 2016 at 10:06:44 PM UTC+1, l.garulli wrote:
>>>>>>>
>>>>>>> On 15 September 2016 at 09:54, Phillip Henry <phill...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi, Luca.
>>>>>>>
>>>>>>> Hi Phillip,
>>>>>>>
>>>>>>>> 3. Yes, default configuration. Apart from adding an index for ACCOUNTS, I did nothing further.
>>>>>>>
>>>>>>> OK, so you have writeQuorum="majority", which means 2 synchronous writes and 1 asynchronous write per transaction.
>>>>>>>
>>>>>>>> 4. Good question. With real data, we expect it to be as you suggest: some nodes with the majority of the payments (e.g., supermarkets). However, for the test data, payments were assigned randomly and, therefore, should be uniformly distributed.
>>>>>>>
>>>>>>> What's your average in terms of number of edges? <10, <50, <200, <1000?
>>>>>>>
>>>>>>>> 2. Yes, I tried plocal minutes after posting (d'oh!). I saw a good improvement. It started about 3 times faster and got faster still (about 10 times faster) by the time I checked this morning on a job running overnight. However, even though it is now running at about 7k transactions per second, a billion edges is still going to take about 40 hours. So I ask myself: is there any way I can make it faster still?
>>>>>>>
>>>>>>> What's missing here is the AUTO-SHARDING INDEX. Example:
>>>>>>>
>>>>>>> accountClass.createIndex("Account.number", OClass.INDEX_TYPE.UNIQUE.toString(), (OProgressListener) null, (ODocument) null, "AUTOSHARDING", new String[] { "number" });
>>>>>>>
>>>>>>> In this way you should go more in parallel, because the index is distributed across all the shards (clusters) of the Account class. You should have 32 of them by default, because you have 32 cores.
>>>>>>>
>>>>>>> Please let me know if, by sorting the from_accounts and with this change, it's much faster.
>>>>>>>
>>>>>>> This is the best you can have out of the box. Pushing the numbers up is slightly more complicated: you should be sure that transactions go in parallel and aren't serialized. This is possible by playing with internal OrientDB settings (mainly the distributed workerThreads) and by having many clusters per class (you could try with 128 first and see how it goes).
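(A sketch of that setup done before the load; the createIndex call mirrors the example above, while the CLUSTERS clause, URL and property type are assumptions:)

import com.orientechnologies.common.listener.OProgressListener;
import com.orientechnologies.orient.core.metadata.schema.OClass;
import com.orientechnologies.orient.core.metadata.schema.OType;
import com.orientechnologies.orient.core.record.impl.ODocument;
import com.orientechnologies.orient.core.sql.OCommandSQL;
import com.tinkerpop.blueprints.impls.orient.OrientGraphNoTx;

public class SchemaSetup {
    public static void main(String[] args) {
        OrientGraphNoTx g = new OrientGraphNoTx("plocal:/temp/mydb");
        try {
            // 128 clusters instead of the default of one per core
            g.command(new OCommandSQL("CREATE CLASS Account EXTENDS V CLUSTERS 128")).execute();
            OClass accountClass = g.getRawGraph().getMetadata().getSchema().getClass("Account");
            accountClass.createProperty("number", OType.STRING);
            // auto-sharding index, spread across the class's clusters
            accountClass.createIndex("Account.number", OClass.INDEX_TYPE.UNIQUE.toString(),
                (OProgressListener) null, (ODocument) null, "AUTOSHARDING", new String[] { "number" });
        } finally {
            g.shutdown();
        }
    }
}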
>>>>>>>> I assume when I start the servers up in distributed mode once more, the data will then be distributed across all nodes in the cluster?
>>>>>>>
>>>>>>> That's right.
>>>>>>>
>>>>>>>> 3. I'll return to concurrent, remote inserts when this job has finished. Hopefully, a smaller batch size will mean there is no degradation in performance either... FYI: with a somewhat unscientific approach, I was polling the server JVM with jstack and saw only a single thread doing all the work, and it *seemed* to spend a lot of its time in ODirtyManager on collection manipulation.
>>>>>>>
>>>>>>> I think it's because you didn't use the AUTO-SHARDING index. Furthermore, running distributed, unfortunately, means the tree-based RidBag is not available (we will support it in the future), so every change to the edges takes a lot of CPU to unmarshal and marshal the entire edge list every time you update a vertex. Hence my recommendation about sorting the vertices.
>>>>>>>
>>>>>>>> I totally appreciate that performance tuning is an empirical science, but do you have any opinions as to which would probably be faster: single-threaded plocal or multithreaded remote?
>>>>>>>
>>>>>>> With v2.2 you can go in parallel by using the tips above. For sure, replication has a cost. I'm sure you can go much faster with just one node and then start the other 2 nodes to have the database replicated automatically, at least for the first massive insertion.
>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Phillip
>>>>>>>
>>>>>>> Luca
>>>>>>>
>>>>>>>> On Wednesday, September 14, 2016 at 3:48:56 PM UTC+1, Phillip Henry wrote:
>>>>>>>>>
>>>>>>>>> Hi, guys.
>>>>>>>>>
>>>>>>>>> I'm conducting a proof-of-concept for a large bank (Luca, we had a 'phone conf on August 5...) and I'm trying to bulk insert a humongous amount of data: 1 million vertices and 1 billion edges.
>>>>>>>>>
>>>>>>>>> Firstly, I'm impressed by how easy it was to configure a cluster. However, the performance of batch inserting is bad (and seems to get considerably worse as I add more data). It starts at about 2k vertices-and-edges per second and deteriorates to about 500/second after only about 3 million edges have been added. This also takes ~30 minutes. Needless to say, 1 billion payments (edges) will take over a week at this rate.
>>>>>>>>>
>>>>>>>>> This is a show-stopper for us.
>>>>>>>>>
>>>>>>>>> My data model is simply payments between accounts, and I store it in one large file. It's just 3 fields and looks like:
>>>>>>>>>
>>>>>>>>> FROM_ACCOUNT TO_ACCOUNT AMOUNT
>>>>>>>>>
>>>>>>>>> In the test data I generated, I had 1 million accounts and 1 billion payments randomly distributed between pairs of accounts.
>>>>>>>>>
>>>>>>>>> I have 2 classes in OrientDB: ACCOUNTS (extending V) and PAYMENT (extending E). There is a UNIQUE_HASH_INDEX on ACCOUNTS for the account number (a string).
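(The schema described here would look something like this with the Graph API; the connection URL is illustrative:)

import com.orientechnologies.orient.core.metadata.schema.OClass;
import com.orientechnologies.orient.core.metadata.schema.OType;
import com.tinkerpop.blueprints.impls.orient.OrientGraphNoTx;

public class PocSchema {
    public static void main(String[] args) {
        OrientGraphNoTx g = new OrientGraphNoTx("remote:host1/payments", "admin", "admin");
        try {
            // ACCOUNTS extends V with a UNIQUE_HASH_INDEX on the account number
            OClass accounts = g.createVertexType("ACCOUNTS");
            accounts.createProperty("number", OType.STRING);
            accounts.createIndex("ACCOUNTS.number", OClass.INDEX_TYPE.UNIQUE_HASH_INDEX, "number");
            // PAYMENT extends E
            g.createEdgeType("PAYMENT");
        } finally {
            g.shutdown();
        }
    }
}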
>>>>>>>>> We're using OrientDB 2.2.7.
>>>>>>>>>
>>>>>>>>> My batch size is 5k, and I am using the "remote" protocol to connect to our cluster.
>>>>>>>>>
>>>>>>>>> I'm using JDK 8, and my 3 boxes are beefy machines (32 cores each) but without SSDs. I wrote the importing code myself but did nothing 'clever' (I think) and used the Graph API. This client code has been given lots of memory, and using jstat I can see it is not excessively GCing.
>>>>>>>>>
>>>>>>>>> So, my questions are:
>>>>>>>>>
>>>>>>>>> 1. What kind of performance can I realistically expect, and can I improve what I have at the moment?
>>>>>>>>>
>>>>>>>>> 2. What kind of degradation should I expect as the graph grows?
>>>>>>>>>
>>>>>>>>> Thanks, guys.
>>>>>>>>>
>>>>>>>>> Phillip

> --
> Best regards,
> Andrey Lomakin, R&D lead.
> OrientDB Ltd
>
> twitter: @Andrey_Lomakin
> linkedin: https://ua.linkedin.com/in/andreylomakin
> blogger: http://andreylomakin.blogspot.com/