[ https://issues.apache.org/jira/browse/SOLR-9284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688601#comment-15688601 ]
Steve Rowe commented on SOLR-9284:
----------------------------------

A couple more "OOM: Direct buffer memory" failures today on Apache Jenkins:

From [https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1160/]:

{noformat}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=HdfsDirectoryFactoryTest -Dtests.method=testInitArgsOrSysPropConfig -Dtests.seed=200C6D6D2F8C2C5F -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt -Dtests.locale=zh-TW -Dtests.timezone=America/Argentina/Buenos_Aires -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
[junit4] ERROR 1.30s J0 | HdfsDirectoryFactoryTest.testInitArgsOrSysPropConfig <<<
[junit4] > Throwable #1: java.lang.RuntimeException: The max direct memory is likely too low. Either increase it (by adding -XX:MaxDirectMemorySize=<size>g -XX:+UseLargePages to your containers startup args) or disable direct allocation using solr.hdfs.blockcache.direct.memory.allocation=false in solrconfig.xml. If you are putting the block cache on the heap, your java heap size might not be large enough. Failed allocating ~134.217728 MB.
[junit4] > at __randomizedtesting.SeedInfo.seed([200C6D6D2F8C2C5F:D7A3A446D205C674]:0)
[junit4] > at org.apache.solr.core.HdfsDirectoryFactory.createBlockCache(HdfsDirectoryFactory.java:304)
[junit4] > at org.apache.solr.core.HdfsDirectoryFactory.getBlockDirectoryCache(HdfsDirectoryFactory.java:280)
[junit4] > at org.apache.solr.core.HdfsDirectoryFactory.create(HdfsDirectoryFactory.java:220)
[junit4] > at org.apache.solr.core.HdfsDirectoryFactoryTest.testInitArgsOrSysPropConfig(HdfsDirectoryFactoryTest.java:108)
[junit4] > at java.lang.Thread.run(Thread.java:745)
[junit4] > Caused by: java.lang.OutOfMemoryError: Direct buffer memory
[junit4] > at java.nio.Bits.reserveMemory(Bits.java:693)
[junit4] > at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
[junit4] > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
[junit4] > at org.apache.solr.store.blockcache.BlockCache.<init>(BlockCache.java:68)
[junit4] > at org.apache.solr.core.HdfsDirectoryFactory.createBlockCache(HdfsDirectoryFactory.java:302)
[junit4] > ... 42 more
[...]
[junit4] 2> 415746 ERROR (SUITE-HdfsDirectoryFactoryTest-seed#[200C6D6D2F8C2C5F]-worker) [ ] o.a.h.m.l.MethodMetric Error invoking method getBlocksTotal
[junit4] 2> java.lang.reflect.InvocationTargetException
[junit4] 2> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] 2> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] 2> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] 2> at java.lang.reflect.Method.invoke(Method.java:498)
[junit4] 2> at org.apache.hadoop.metrics2.lib.MethodMetric$2.snapshot(MethodMetric.java:111)
[junit4] 2> at org.apache.hadoop.metrics2.lib.MethodMetric.snapshot(MethodMetric.java:144)
[junit4] 2> at org.apache.hadoop.metrics2.lib.MetricsRegistry.snapshot(MetricsRegistry.java:401)
[junit4] 2> at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder$1.getMetrics(MetricsSourceBuilder.java:79)
[junit4] 2> at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:194)
[junit4] 2> at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
[junit4] 2> at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
[junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getClassName(DefaultMBeanServerInterceptor.java:1804)
[junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.safeGetClassName(DefaultMBeanServerInterceptor.java:1595)
[junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.checkMBeanPermission(DefaultMBeanServerInterceptor.java:1813)
[junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:430)
[junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
[junit4] 2> at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
[junit4] 2> at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:81)
[junit4] 2> at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.stopMBeans(MetricsSourceAdapter.java:226)
[junit4] 2> at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.stop(MetricsSourceAdapter.java:211)
[junit4] 2> at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.stopSources(MetricsSystemImpl.java:463)
[junit4] 2> at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.stop(MetricsSystemImpl.java:213)
[junit4] 2> at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.shutdown(MetricsSystemImpl.java:594)
[junit4] 2> at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.shutdownInstance(DefaultMetricsSystem.java:72)
[junit4] 2> at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.shutdown(DefaultMetricsSystem.java:68)
[junit4] 2> at org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics.shutdown(NameNodeMetrics.java:171)
[junit4] 2> at org.apache.hadoop.hdfs.server.namenode.NameNode.stop(NameNode.java:872)
[junit4] 2> at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1726)
[junit4] 2> at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1705)
[junit4] 2> at org.apache.solr.cloud.hdfs.HdfsTestUtil.teardownClass(HdfsTestUtil.java:198)
[junit4] 2> at org.apache.solr.core.HdfsDirectoryFactoryTest.teardownClass(HdfsDirectoryFactoryTest.java:61)
[junit4] 2> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] 2> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] 2> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] 2> at java.lang.reflect.Method.invoke(Method.java:498)
[junit4] 2> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
[junit4] 2> at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:870)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
[junit4] 2> at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4] 2> at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4] 2> at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
[junit4] 2> at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
[junit4] 2> at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
[junit4] 2> at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4] 2> at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
[junit4] 2> at java.lang.Thread.run(Thread.java:745)
[junit4] 2> Caused by: java.lang.NullPointerException
[junit4] 2> at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:203)
[junit4] 2> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getTotalBlocks(BlockManager.java:3370)
[junit4] 2> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlocksTotal(FSNamesystem.java:5729)
[junit4] 2> ... 54 more
[...]
[junit4] 2> NOTE: test params are: codec=Asserting(Lucene70): {}, docValues:{}, maxPointsInLeafNode=174, maxMBSortInHeap=6.915978870333232, sim=RandomSimilarity(queryNorm=false): {}, locale=zh-TW, timezone=America/Argentina/Buenos_Aires
[junit4] 2> NOTE: Linux 3.13.0-85-generic amd64/Oracle Corporation 1.8.0_102 (64-bit)/cpus=4,threads=2,free=364870536,total=525860864
[junit4] 2> NOTE: All tests run in this JVM: [TestFileDictionaryLookup, HdfsChaosMonkeySafeLeaderTest, DistributedDebugComponentTest, TestExactSharedStatsCache, TestLFUCache, TestFieldCacheSort, HdfsUnloadDistributedZkTest, TestLockTree, TestHighlightDedupGrouping, TestDFISimilarityFactory, SolrRequestParserTest, SyncSliceTest, CreateCollectionCleanupTest, HdfsDirectoryFactoryTest]
{noformat}

From [https://builds.apache.org/job/Lucene-Solr-NightlyTests-6.x/207]:

{noformat}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=BlockDirectoryTest -Dtests.method=testRandomAccessWritesLargeCache -Dtests.seed=85E88260B81B20E2 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-6.x/test-data/enwiki.random.lines.txt -Dtests.locale=id-ID -Dtests.timezone=Africa/Libreville -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR 1.64s J1 | BlockDirectoryTest.testRandomAccessWritesLargeCache <<<
[junit4] > Throwable #1: java.lang.OutOfMemoryError: Direct buffer memory
[junit4] > at java.nio.Bits.reserveMemory(Bits.java:693)
[junit4] > at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
[junit4] > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
[junit4] > at org.apache.solr.store.blockcache.BlockCache.<init>(BlockCache.java:68)
[junit4] > at org.apache.solr.store.blockcache.BlockDirectoryTest.setUp(BlockDirectoryTest.java:119)
[junit4] > at java.lang.Thread.run(Thread.java:745)Throwable #2: java.lang.NullPointerException
[junit4] > at org.apache.solr.store.blockcache.BlockDirectoryTest.tearDown(BlockDirectoryTest.java:131)
[...]
[junit4] 2> NOTE: test params are: codec=Asserting(Lucene62): {}, docValues:{}, maxPointsInLeafNode=1406, maxMBSortInHeap=7.589330986925872, sim=RandomSimilarity(queryNorm=false,coord=yes): {}, locale=id-ID, timezone=Africa/Libreville
[junit4] 2> NOTE: Linux 3.13.0-85-generic amd64/Oracle Corporation 1.8.0_102 (64-bit)/cpus=4,threads=1,free=317294960,total=496500736
[junit4] 2> NOTE: All tests run in this JVM: [WordBreakSolrSpellCheckerTest, DocumentBuilderTest, TestHashQParserPlugin, TestDynamicFieldResource, DateMathParserTest, CollectionsAPIDistributedZkTest, HdfsRecoveryZkTest, TestDocBasedVersionConstraints, TestCloudManagedSchema, SpellCheckCollatorTest, HdfsBasicDistributedZkTest, TestLuceneMatchVersion, SpatialFilterTest, CustomCollectionTest, TestUseDocValuesAsStored2, TestCharFilters, BlockDirectoryTest]
{noformat}

> The HDFS BlockDirectoryCache should not let it's keysToRelease or names maps grow indefinitely.
> -----------------------------------------------------------------------------------------------
>
> Key: SOLR-9284
> URL: https://issues.apache.org/jira/browse/SOLR-9284
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: hdfs
> Reporter: Mark Miller
> Assignee: Mark Miller
> Fix For: master (7.0), 6.4
>
> Attachments: SOLR-9284.patch, SOLR-9284.patch
>
>
> https://issues.apache.org/jira/browse/SOLR-9284

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
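
For anyone trying to reproduce the "Direct buffer memory" errors quoted in this comment outside the test framework: both traces end in `ByteBuffer.allocateDirect`, and the reported size of ~134.217728 MB is 128 * 1024 * 1024 bytes. The following is only a minimal sketch of that allocation path, not Solr code. The class `DirectBufferSketch` and the helper `allocate` are hypothetical names introduced for illustration, and the fall-back to a heap buffer is an assumption about how one might probe the limit, not something `HdfsDirectoryFactory` is known to do.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-alone probe; none of these names come from Solr.
class DirectBufferSketch {

    // 134217728 bytes, which the log reports as "~134.217728 MB".
    static final int FAILED_ALLOCATION_BYTES = 128 * 1024 * 1024;

    /**
     * Tries an off-heap allocation first when preferDirect is true.
     * If the JVM's direct-memory limit is exhausted, allocateDirect
     * throws OutOfMemoryError; this sketch then falls back to the heap.
     */
    static ByteBuffer allocate(int bytes, boolean preferDirect) {
        if (preferDirect) {
            try {
                return ByteBuffer.allocateDirect(bytes);
            } catch (OutOfMemoryError e) {
                // Assumption: report and fall back instead of failing the caller.
                System.err.println("direct allocation of " + bytes
                        + " bytes failed: " + e.getMessage());
            }
        }
        return ByteBuffer.allocate(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocate(FAILED_ALLOCATION_BYTES, true);
        System.out.println((buf.isDirect() ? "direct" : "heap")
                + " buffer, capacity=" + buf.capacity());
    }
}
```

Running this with a deliberately small limit, for example `-XX:MaxDirectMemorySize=64m`, should exercise the fall-back branch, provided the heap is large enough to hold the 128 MiB buffer. The exact exception message may differ between JVM versions.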