After 70 hours on a 32GB VM with OrientDB 2.2.20 and JRE 8u131, the job failed with a "Direct buffer memory" OutOfMemoryError. Given the complications I mentioned earlier, my next step is to get a high-RAM AWS EC2 instance and run the job there. However, as I also mentioned, my leadership is getting frustrated with this situation.
-- John

'Battle of banja luka'.
com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #63:19090001 (cluster: xlink_simple_2) DB name="kb"
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:2050)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.loadRecord(OTransactionOptimistic.java:187)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.loadRecord(OTransactionOptimistic.java:162)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.loadRecord(OTransactionOptimistic.java:291)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:1729)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:102)
    at com.orientechnologies.orient.core.id.ORecordId.getRecord(ORecordId.java:329)
    at com.tinkerpop.blueprints.impls.orient.OrientEdgeIterator.createGraphElement(OrientEdgeIterator.java:72)
    at com.tinkerpop.blueprints.impls.orient.OrientEdgeIterator.createGraphElement(OrientEdgeIterator.java:44)
    at com.orientechnologies.orient.core.iterator.OLazyWrapperIterator.hasNext(OLazyWrapperIterator.java:93)
    at com.orientechnologies.common.collection.OMultiCollectionIterator.hasNextInternal(OMultiCollectionIterator.java:97)
    at com.orientechnologies.common.collection.OMultiCollectionIterator.hasNext(OMultiCollectionIterator.java:78)
    at com.lusidity.mind.model.Node.getLinks(Node.java:308)
    at com.lusidity.mind.model.Node.hasLink(Node.java:435)
    at com.lusidity.mind.etl.providers.mediawiki.BaseMediaWikiPage.loadHyperlinks(BaseMediaWikiPage.java:401)
    at com.lusidity.mind.etl.providers.mediawiki.BaseMediaWikiPage.link(BaseMediaWikiPage.java:260)
    at com.lusidity.mind.etl.providers.mediawiki.BaseMediaWikiPage.load(BaseMediaWikiPage.java:240)
    at com.lusidity.mind.etl.providers.mediawiki.BaseMediaWikiPage.process(BaseMediaWikiPage.java:98)
    at com.lusidity.mind.etl.providers.mediawiki.ArticleHandler.process(ArticleHandler.java:113)
    at com.lusidity.mind.etl.providers.mediawiki.ArticleHandler.process(ArticleHandler.java:75)
    at info.bliki.wiki.dump.WikiXMLParser.endElement(WikiXMLParser.java:155)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2967)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at info.bliki.wiki.dump.WikiXMLParser.parse(WikiXMLParser.java:194)
    at com.lusidity.mind.etl.providers.mediawiki.MediaWiki.run(MediaWiki.java:133)
    at com.lusidity.mind.etl.providers.mediawiki.MediaWiki.run(MediaWiki.java:110)
    at com.lusidity.mind.shell.commands.ImportCommand.execute(ImportCommand.java:105)
    at com.lusidity.mind.shell.Shell.execute(Shell.java:265)
    at com.lusidity.mind.shell.Shell.execute(Shell.java:214)
    at com.lusidity.mind.shell.commands.ExecCommand.lambda$execute$0(ExecCommand.java:82)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.ReferencePipeline$Head.forEachOrdered(ReferencePipeline.java:590)
    at com.lusidity.mind.shell.commands.ExecCommand.execute(ExecCommand.java:78)
    at com.lusidity.mind.shell.Shell.execute(Shell.java:265)
    at com.lusidity.mind.shell.Shell.execute(Shell.java:214)
    at com.lusidity.mind.shell.Shell.run(Shell.java:173)
    at com.lusidity.mind.Program.runInteractive(Program.java:209)
    at com.lusidity.mind.Program.run(Program.java:170)
    at com.lusidity.mind.Program.main(Program.java:102)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:694)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
    at com.orientechnologies.common.directmemory.OByteBufferPool.allocateBuffer(OByteBufferPool.java:328)
    at com.orientechnologies.common.directmemory.OByteBufferPool.acquireDirect(OByteBufferPool.java:279)
    at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.cacheFileContent(OWOWCache.java:1280)
    at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.load(OWOWCache.java:656)
    at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.updateCache(O2QCache.java:1102)
    at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.doLoad(O2QCache.java:353)
    at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.load(O2QCache.java:298)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurableComponent.loadPage(ODurableComponent.java:148)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.readRecordBuffer(OPaginatedCluster.java:691)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.readRecord(OPaginatedCluster.java:667)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.readRecord(OPaginatedCluster.java:646)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doReadRecord(OAbstractPaginatedStorage.java:3260)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.readRecord(OAbstractPaginatedStorage.java:2879)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.readRecord(OAbstractPaginatedStorage.java:1064)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx$SimpleRecordReader.readRecord(ODatabaseDocumentTx.java:3436)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:2012)
    ... 47 common frames omitted

On Tue, May 16, 2017 at 11:42 AM, John J. Szucs <[email protected]> wrote:

> I've had some "complications" (namely, being hospitalized for a medical
> issue), but I am running the job right now with OrientDB 2.2.20 and JRE
> 8u131. It's only a 32GB VM for now, but it's almost 50% complete and the
> results are good so far.
>
> On Mon, May 15, 2017 at 10:29 AM, Claudio Massi <[email protected]>
> wrote:
>
>> Hi John,
>> if you have 64gb ram, to avoid swapping jvm, try to keep process size
>> below 64gb, so use Xmx + MaxDirectMemorySize below the available ram
>>
>> Try orientdb 2.2.20 with java 8u131-b11, if using G1GC
>>
>> Monitor heap usage with: jstat -gc pid 120s 9999999
>>
>> Monitor direct memory usage with any jmx tool (see
>> http://andreylomakin.blogspot.it/2016/05/how-to-calculate-maximum-amount-of.html )
>> - use jconsole, section MBeans, choose
>>   com.orientechnologies.common.directmemory
>>   -> OByteBufferPoolMXBean -> Attribute
>> - use MonBuffers.java (Source from Alan B. in
>>   https://gist.github.com/t3rmin4t0r/1a753ccdcfa8d111f07c then increment
>>   Thread.sleep(2000), and run adding tools.jar in classpath )
>> - use jmxterm (http://wiki.cyclopsgroup.org/jmxterm/)
>> ...
>>
>> Claudio
>>
>> On Friday, May 5, 2017 at 18:57:26 UTC+2, John J. Szucs wrote:
>>>
>>> Andrey,
>>>
>>> THANK YOU! I will give this a try as soon as I can.
>>>
>>> I will also do some JVM profi
>>>
>>> -- John
>>>
>>> On May 5, 2017, at 05:05, Andrey Lomakin <[email protected]> wrote:
>>>
>>> Hi John,
>>> If you wish, you can use this build until we do the official release:
>>> https://drive.google.com/file/d/0B2oZq2xVp841T2diVGtTcmZ5OTQ/view?usp=sharing
>>>
>>> On Fri, May 5, 2017 at 11:58 AM Andrey Lomakin <[email protected]>
>>> wrote:
>>>
>>>> Hi John,
>>>>
>>>> I suppose you encountered issue
>>>> https://github.com/orientechnologies/orientdb/issues/7390
>>>> We will provide a release soon.
>>>>
>>>> Also, please do not use such a huge heap size. We use the heap only to
>>>> keep temporary data, so I suggest you lower the heap size to give ODB
>>>> the chance to use more direct memory.
>>>>
>>>> On Fri, May 5, 2017 at 10:51 AM Luigi Dell'Aquila <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> How are you doing the import? Are you working in a transaction? Some
>>>>> code will help us understand where the problem is.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Luigi
>>>>>
>>>>>
>>>>> 2017-05-05 3:53 GMT+02:00 John J. Szucs <[email protected]>:
>>>>>
>>>>>> Hello, OrientDB community! It's me again with another question.
>>>>>>
>>>>>> I am still working on my project and have encountered another serious
>>>>>> challenge: it seems that writing to indices (especially edge indices?)
>>>>>> can cause OrientDB's direct (non-JVM) memory usage to grow without
>>>>>> bounds until the system effectively grinds to a halt due to swap.
>>>>>>
>>>>>> The specific use case is building a graph based on (English)
>>>>>> Wikipedia. There are approximately 17.4M vertices representing pages
>>>>>> (including articles, categories, and various meta pages). These
>>>>>> vertices are connected by approximately 65M (at last count) edges.
>>>>>> There are a few super-nodes. For example, the vertex representing
>>>>>> https://en.wikipedia.org/wiki/United_States has (at last count) 306K
>>>>>> incoming edges and 822 outgoing edges. However, the degree of the
>>>>>> vertices roughly follows a Zipf distribution and the vast majority of
>>>>>> vertices have only a few (<10) total (in and out) edges. There are
>>>>>> also some other vertex and edge types for lexical data, but I think
>>>>>> those are secondary to the issue.
>>>>>>
>>>>>> Per previous discussion here and on StackOverflow, I have added
>>>>>> automatic edge indices on in, out, or the composite of the two to
>>>>>> optimize edge queries. When I run the process to extract, transform,
>>>>>> and load the data from Wikipedia's XML dumps (using my own ETL code,
>>>>>> not OrientDB's), after 24-48 hours, the Linux System Monitor shows
>>>>>> that physical memory usage has reached 99.9% and then swap usage
>>>>>> begins to grow. At this point, the process is effectively halted by
>>>>>> swap thrashing.
>>>>>>
>>>>>> I am running this on a Fedora 25 Linux VM with 64GB RAM and 16 CPU
>>>>>> cores allocated. The JVM settings are as follows:
>>>>>>
>>>>>> -Xmx32g -Xms32g -server -XX:+PerfDisableSharedMem -XX:+UseG1GC
>>>>>> -XX:MaxDirectMemorySize=64413m -Dstorage.wal.syncOnPageFlush=false
>>>>>>
>>>>>> The MaxDirectMemorySize parameter is recommended by OrientDB itself,
>>>>>> during start-up with the "out-of-memory errors" warning. It does seem
>>>>>> odd to me that Xmx + MaxDirectMemorySize > available RAM, but I'm more
>>>>>> of a deep R&D (not DevOps) guy, so I'm just accepting that unless
>>>>>> someone advises me otherwise.
>>>>>>
>>>>>> If I disable the edge indices, then the process runs fine and
>>>>>> completes in a "reasonable" (for it) amount of time: 2-3 days. Of
>>>>>> course, if I do this, my run-time performance suffers intolerably.
>>>>>>
>>>>>> I am running this with OrientDB 2.2.19. I was able to quickly get my
>>>>>> code to build with 3.0 M1, but some of the unit tests fail and I am
>>>>>> under far too much pressure about this issue from my leadership to
>>>>>> try to troubleshoot them right now.
>>>>>>
>>>>>> What can I do to solve this issue? Thanks in advance for your help!
>>>>>>
>>>>>> -- John
>>>>>>
>>>>>> --
>>>>>>
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "OrientDB" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>> --
>>>> Best regards,
>>>> Andrey Lomakin, R&D lead.
>>>> OrientDB Ltd
>>>>
>>>> twitter: @Andrey_Lomakin
>>>> linkedin: https://ua.linkedin.com/in/andreylomakin
>>>> blogger: http://andreylomakin.blogspot.com/
>>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to a topic in the
>>> Google Groups "OrientDB" group.
>>> To unsubscribe from this topic, visit
>>> https://groups.google.com/d/topic/orient-database/p0JF5IGsqcs/unsubscribe.
>>> To unsubscribe from this group and all its topics, send an email to
>>> [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
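
For anyone following this thread later: Claudio's two suggestions (keep Xmx + MaxDirectMemorySize below the available RAM, and watch direct memory over JMX) can be sketched in a few lines of Java. This is only a rough sketch, not anything from OrientDB itself. The class and method names (DirectMemoryCheck, fitsInRam, directBytesUsed) are made up for illustration, and the 8 GiB / 48 GiB / 4 GiB split is a hypothetical example rather than a recommendation from anyone in the thread. It reads the JDK's standard BufferPoolMXBean instead of OrientDB's OByteBufferPoolMXBean, on the assumption that OrientDB's buffers show up in the JVM "direct" pool, since the stack trace shows them being allocated through ByteBuffer.allocateDirect.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper; none of these names come from OrientDB or the thread.
class DirectMemoryCheck {

    /** True when heap + direct memory + headroom (all in MiB) fit in physical RAM. */
    static boolean fitsInRam(long heapMiB, long directMiB, long headroomMiB, long ramMiB) {
        return heapMiB + directMiB + headroomMiB <= ramMiB;
    }

    /** Bytes used by the JVM "direct" buffer pool, or -1 if the pool is not reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long ramMiB = 64L * 1024L; // the 64GB VM from the original post

        // Settings from the original post: -Xmx32g and -XX:MaxDirectMemorySize=64413m.
        System.out.println("settings from the thread fit in RAM: "
                + fitsInRam(32L * 1024L, 64413L, 0L, ramMiB));

        // Illustrative split only (assumption): 8 GiB heap, 48 GiB direct, 4 GiB headroom.
        System.out.println("illustrative split fits in RAM: "
                + fitsInRam(8L * 1024L, 48L * 1024L, 4L * 1024L, ramMiB));

        // Allocate a small direct buffer so the pool has something to report.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024);
        System.out.println("direct bytes in use: " + directBytesUsed()
                + " (buffer capacity " + buffer.capacity() + ")");
    }
}
```

In a long-running import, directBytesUsed() could be logged periodically next to the jstat heap figures, which would show whether direct memory is approaching the MaxDirectMemorySize limit well before the OutOfMemoryError is thrown.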
