Re: [orientdb] Indices and Memory Usage

John J. Szucs Wed, 17 May 2017 09:07:27 -0700

After 70 hours on a 32GB VM with ODB 2.2.20 and JRE 8u131, the job failed with
a direct buffer memory exception (stack trace below). Given the complications
I mentioned earlier, my next step is to get a high-RAM AWS EC2 instance and
run the job there. However, my leadership is getting frustrated with this
situation.

-- John

'Battle of banja luka'.
com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #63:19090001 (cluster: xlink_simple_2)

DB name="kb"
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:2050)
at com.orientechnologies.orient.core.tx.OTransactionOptimistic.loadRecord(OTransactionOptimistic.java:187)
at com.orientechnologies.orient.core.tx.OTransactionOptimistic.loadRecord(OTransactionOptimistic.java:162)
at com.orientechnologies.orient.core.tx.OTransactionOptimistic.loadRecord(OTransactionOptimistic.java:291)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:1729)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:102)
at com.orientechnologies.orient.core.id.ORecordId.getRecord(ORecordId.java:329)
at com.tinkerpop.blueprints.impls.orient.OrientEdgeIterator.createGraphElement(OrientEdgeIterator.java:72)
at com.tinkerpop.blueprints.impls.orient.OrientEdgeIterator.createGraphElement(OrientEdgeIterator.java:44)
at com.orientechnologies.orient.core.iterator.OLazyWrapperIterator.hasNext(OLazyWrapperIterator.java:93)
at com.orientechnologies.common.collection.OMultiCollectionIterator.hasNextInternal(OMultiCollectionIterator.java:97)
at com.orientechnologies.common.collection.OMultiCollectionIterator.hasNext(OMultiCollectionIterator.java:78)
at com.lusidity.mind.model.Node.getLinks(Node.java:308)
at com.lusidity.mind.model.Node.hasLink(Node.java:435)
at com.lusidity.mind.etl.providers.mediawiki.BaseMediaWikiPage.loadHyperlinks(BaseMediaWikiPage.java:401)
at com.lusidity.mind.etl.providers.mediawiki.BaseMediaWikiPage.link(BaseMediaWikiPage.java:260)
at com.lusidity.mind.etl.providers.mediawiki.BaseMediaWikiPage.load(BaseMediaWikiPage.java:240)
at com.lusidity.mind.etl.providers.mediawiki.BaseMediaWikiPage.process(BaseMediaWikiPage.java:98)
at com.lusidity.mind.etl.providers.mediawiki.ArticleHandler.process(ArticleHandler.java:113)
at com.lusidity.mind.etl.providers.mediawiki.ArticleHandler.process(ArticleHandler.java:75)
at info.bliki.wiki.dump.WikiXMLParser.endElement(WikiXMLParser.java:155)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2967)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at info.bliki.wiki.dump.WikiXMLParser.parse(WikiXMLParser.java:194)
at com.lusidity.mind.etl.providers.mediawiki.MediaWiki.run(MediaWiki.java:133)
at com.lusidity.mind.etl.providers.mediawiki.MediaWiki.run(MediaWiki.java:110)
at com.lusidity.mind.shell.commands.ImportCommand.execute(ImportCommand.java:105)
at com.lusidity.mind.shell.Shell.execute(Shell.java:265)
at com.lusidity.mind.shell.Shell.execute(Shell.java:214)
at com.lusidity.mind.shell.commands.ExecCommand.lambda$execute$0(ExecCommand.java:82)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.ReferencePipeline$Head.forEachOrdered(ReferencePipeline.java:590)
at com.lusidity.mind.shell.commands.ExecCommand.execute(ExecCommand.java:78)
at com.lusidity.mind.shell.Shell.execute(Shell.java:265)
at com.lusidity.mind.shell.Shell.execute(Shell.java:214)
at com.lusidity.mind.shell.Shell.run(Shell.java:173)
at com.lusidity.mind.Program.runInteractive(Program.java:209)
at com.lusidity.mind.Program.run(Program.java:170)
at com.lusidity.mind.Program.main(Program.java:102)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:694)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
at com.orientechnologies.common.directmemory.OByteBufferPool.allocateBuffer(OByteBufferPool.java:328)
at com.orientechnologies.common.directmemory.OByteBufferPool.acquireDirect(OByteBufferPool.java:279)
at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.cacheFileContent(OWOWCache.java:1280)
at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.load(OWOWCache.java:656)
at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.updateCache(O2QCache.java:1102)
at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.doLoad(O2QCache.java:353)
at com.orientechnologies.orient.core.storage.cache.local.twoq.O2QCache.load(O2QCache.java:298)
at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurableComponent.loadPage(ODurableComponent.java:148)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.readRecordBuffer(OPaginatedCluster.java:691)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.readRecord(OPaginatedCluster.java:667)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.readRecord(OPaginatedCluster.java:646)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doReadRecord(OAbstractPaginatedStorage.java:3260)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.readRecord(OAbstractPaginatedStorage.java:2879)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.readRecord(OAbstractPaginatedStorage.java:1064)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx$SimpleRecordReader.readRecord(ODatabaseDocumentTx.java:3436)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:2012)
... 47 common frames omitted
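
The "Caused by" frames above show the failure coming from
ByteBuffer.allocateDirect, so the limit that was hit is the JVM's direct
buffer pool (capped by -XX:MaxDirectMemorySize), not the heap. Below is a
minimal sketch, not taken from this thread, of reading that pool's usage from
inside the same JVM through the standard java.lang.management.BufferPoolMXBean.
The class name DirectBufferProbe and the 1 MiB test allocation are illustrative
assumptions, and whether this figure matches what OrientDB's own
OByteBufferPoolMXBean reports is not confirmed here.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper (not part of OrientDB or the original ETL code).
public class DirectBufferProbe {

    /** Returns the bytes used by the JVM's "direct" buffer pool, or -1 if no such pool is exposed. */
    static long directMemoryUsed() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            // HotSpot exposes pools named "direct" and "mapped".
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        // Illustrative allocation so the counter visibly changes.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
        long after = directMemoryUsed();
        System.out.println("direct pool before: " + before + " bytes");
        System.out.println("direct pool after:  " + after + " bytes");
        System.out.println("test buffer capacity: " + buffer.capacity() + " bytes");
    }
}
```

Logging this value periodically during a long import would show whether direct
memory climbs steadily toward the configured maximum or stays flat until a
sudden spike.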

On Tue, May 16, 2017 at 11:42 AM, John J. Szucs <[email protected]>
wrote:

> I've had some "complications" (namely, being hospitalized for a medical
> issue), but I am running the job right now with OrientDB 2.2.20 and JRE
> 8u131. It's only a 32GB VM for now, but it's almost 50% complete and the
> results are good so far.
>
> On Mon, May 15, 2017 at 10:29 AM, Claudio Massi <[email protected]>
> wrote:
>
>> Hi John,
>> If you have 64 GB of RAM, keep the JVM process size below 64 GB to avoid
>> swapping: set Xmx + MaxDirectMemorySize below the available RAM.
>>
>> If you are using G1GC, try OrientDB 2.2.20 with Java 8u131-b11.
>>
>> Monitor heap usage with: jstat -gc <pid> 120s 9999999
>>
>> Monitor direct memory usage with any JMX tool (see
>> http://andreylomakin.blogspot.it/2016/05/how-to-calculate-maximum-amount-of.html ):
>> - use jconsole, section MBeans, choose
>> com.orientechnologies.common.directmemory
>> -> OByteBufferPoolMXBean -> Attribute
>> - use MonBuffers.java (source from Alan B. in
>> https://gist.github.com/t3rmin4t0r/1a753ccdcfa8d111f07c ; increase the
>> Thread.sleep(2000) interval and run it with tools.jar on the classpath)
>> - use jmxterm (http://wiki.cyclopsgroup.org/jmxterm/)
>> ...
>>
>> Claudio
>>
>> On Friday, May 5, 2017 at 18:57:26 UTC+2, John J. Szucs wrote:
>>>
>>> Andrey,
>>>
>>> THANK YOU! I will give this a try as soon as I can.
>>>
>>> I will also do some JVM profi
>>>
>>> — John
>>>
>>> On May 5, 2017, at 05:05, Andrey Lomakin <[email protected]> wrote:
>>>
>>> Hi John,
>>> If you wish, you can use this build until we publish the official release:
>>> https://drive.google.com/file/d/0B2oZq2xVp841T2diVGtTcmZ5OTQ/view?usp=sharing
>>>
>>> On Fri, May 5, 2017 at 11:58 AM Andrey Lomakin <[email protected]>
>>> wrote:
>>>
>>>> Hi John,
>>>>
>>>> I suppose you encountered issue
>>>> https://github.com/orientechnologies/orientdb/issues/7390
>>>> We will provide a release soon.
>>>>
>>>> Also, please do not use such a huge heap size. We use the heap only to
>>>> keep temporary data, so I suggest you lower the heap size to give ODB the
>>>> chance to use more direct memory.
>>>>
>>>> On Fri, May 5, 2017 at 10:51 AM Luigi Dell'Aquila <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> How are you doing the import? Are you working in a transaction? Some
>>>>> code would help us understand where the problem is.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Luigi
>>>>>
>>>>>
>>>>> 2017-05-05 3:53 GMT+02:00 John J. Szucs <[email protected]>:
>>>>>
>>>>>> Hello, OrientDB community! It's me again with another question.
>>>>>>
>>>>>> I am still working on my project and have encountered another serious
>>>>>> challenge: it seems that writing to indices (especially edge indices?)
>>>>>> can cause OrientDB's direct (off-heap) memory usage to grow without
>>>>>> bounds until the system effectively grinds to a halt due to swap.
>>>>>>
>>>>>> The specific use case is building a graph based on (English)
>>>>>> Wikipedia. There are approximately 17.4M vertices representing pages
>>>>>> (including articles, categories, and various meta pages). These vertices
>>>>>> are connected by approximately 65M (at last count) edges. There are a few
>>>>>> super-nodes. For example, the vertex representing
>>>>>> https://en.wikipedia.org/wiki/United_States has (at last count) 306K
>>>>>> incoming edges and 822 outgoing edges. However, the degree of the
>>>>>> vertices roughly follows a Zipf distribution and the vast majority of
>>>>>> vertices have only a few (<10) total (in and out) edges. There are also
>>>>>> some other vertex and edge types for lexical data, but I think those are
>>>>>> secondary to the issue.
>>>>>>
>>>>>> Per previous discussion here and on StackOverflow, I have added
>>>>>> automatic edge indices on in, out, or the composite of the two to
>>>>>> optimize edge queries. When I run the process to extract, transform, and
>>>>>> load the data from Wikipedia's XML dumps (using my own ETL code, not
>>>>>> OrientDB's), after 24-48 hours, the Linux System Monitor shows that
>>>>>> physical memory usage has reached 99.9% and then swap usage begins to
>>>>>> grow. At this point, the process is effectively halted by swap thrashing.
>>>>>>
>>>>>> I am running this on a Fedora 25 Linux VM with 64GB RAM and 16 CPU
>>>>>> cores allocated. The JVM settings are as follows:
>>>>>>
>>>>>> -Xmx32g -Xms32g -server -XX:+PerfDisableSharedMem -XX:+UseG1GC
>>>>>> -XX:MaxDirectMemorySize=64413m -Dstorage.wal.syncOnPageFlush=false
>>>>>>
>>>>>> The MaxDirectMemorySize parameter is recommended by OrientDB itself,
>>>>>> during start-up with the "out-of-memory errors" warning. It does seem odd
>>>>>> to me that Xmx+MaxDirectMemorySize>available RAM, but I'm more of a
>>>>>> deep R&D (not DevOps) guy, so I'm just accepting that unless someone
>>>>>> advises me otherwise.
>>>>>>
>>>>>> If I disable the edge indices, then the process runs fine and
>>>>>> completes in a "reasonable" (for it) amount of time: 2-3 days. Of course,
>>>>>> if I do this, my run-time performance suffers intolerably.
>>>>>>
>>>>>> I am running this with OrientDB 2.2.19. I was able to quickly get my
>>>>>> code to build with 3.0 M1, but some of the unit tests fail and I am under
>>>>>> far too much pressure about this issue from my leadership to try to
>>>>>> troubleshoot them right now.
>>>>>>
>>>>>> What can I do to solve this issue? Thanks in advance for your help!
>>>>>>
>>>>>> -- John
>>>>>>
>>>>>> --
>>>>>>
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "OrientDB" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>>
>>>> --
>>>> Best regards,
>>>> Andrey Lomakin, R&D lead.
>>>> OrientDB Ltd
>>>>
>>>> twitter: @Andrey_Lomakin
>>>> linkedin: https://ua.linkedin.com/in/andreylomakin
>>>> blogger: http://andreylomakin.blogspot.com/
>>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to a topic in the
>>> Google Groups "OrientDB" group.
>>> To unsubscribe from this topic, visit
>>> https://groups.google.com/d/topic/orient-database/p0JF5IGsqcs/unsubscribe.
>>> To unsubscribe from this group and all its topics, send an email to
>>> [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>>
>
>
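
The sizing rule quoted in this thread (keep Xmx plus MaxDirectMemorySize below
the available RAM) can be checked against the settings from the original post
with simple arithmetic. A minimal sketch: the class and method names are
hypothetical, 64GB is taken as 64 * 1024 MB, and other JVM overhead (metaspace,
thread stacks, GC structures) is ignored, so the real footprint would be
somewhat higher.

```java
// Hypothetical helper for checking a JVM memory budget; not OrientDB code.
public class MemoryBudget {

    /** Returns how many MB the configured heap + direct memory exceed RAM by (negative means headroom). */
    static long overcommitMb(long heapMb, long directMb, long ramMb) {
        return heapMb + directMb - ramMb;
    }

    public static void main(String[] args) {
        long heapMb = 32L * 1024;   // -Xmx32g from the original post
        long directMb = 64413L;     // -XX:MaxDirectMemorySize=64413m from the original post
        long ramMb = 64L * 1024;    // 64GB VM, assumed to mean 64 * 1024 MB

        long over = overcommitMb(heapMb, directMb, ramMb);
        System.out.println("heap + direct = " + (heapMb + directMb) + " MB");
        System.out.println("RAM           = " + ramMb + " MB");
        System.out.println("overcommit    = " + over + " MB");
    }
}
```

With the posted settings the configured maximums add up to 97181 MB against
65536 MB of RAM, an overcommit of 31645 MB, which is consistent with the swap
thrashing described in the original post.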
