[ https://issues.apache.org/jira/browse/SOLR-9284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688601#comment-15688601 ]

Steve Rowe commented on SOLR-9284:
----------------------------------

A couple more "OOM: Direct buffer memory" failures today on Apache Jenkins:

From [https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1160/]:
{noformat}
  [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=HdfsDirectoryFactoryTest -Dtests.method=testInitArgsOrSysPropConfig 
-Dtests.seed=200C6D6D2F8C2C5F -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.locale=zh-TW -Dtests.timezone=America/Argentina/Buenos_Aires 
-Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   [junit4] ERROR   1.30s J0 | 
HdfsDirectoryFactoryTest.testInitArgsOrSysPropConfig <<<
   [junit4]    > Throwable #1: java.lang.RuntimeException: The max direct 
memory is likely too low.  Either increase it (by adding 
-XX:MaxDirectMemorySize=<size>g -XX:+UseLargePages to your containers startup 
args) or disable direct allocation using 
solr.hdfs.blockcache.direct.memory.allocation=false in solrconfig.xml. If you 
are putting the block cache on the heap, your java heap size might not be large 
enough. Failed allocating ~134.217728 MB.
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([200C6D6D2F8C2C5F:D7A3A446D205C674]:0)
   [junit4]    >        at 
org.apache.solr.core.HdfsDirectoryFactory.createBlockCache(HdfsDirectoryFactory.java:304)
   [junit4]    >        at 
org.apache.solr.core.HdfsDirectoryFactory.getBlockDirectoryCache(HdfsDirectoryFactory.java:280)
   [junit4]    >        at 
org.apache.solr.core.HdfsDirectoryFactory.create(HdfsDirectoryFactory.java:220)
   [junit4]    >        at 
org.apache.solr.core.HdfsDirectoryFactoryTest.testInitArgsOrSysPropConfig(HdfsDirectoryFactoryTest.java:108)
   [junit4]    >        at java.lang.Thread.run(Thread.java:745)
   [junit4]    > Caused by: java.lang.OutOfMemoryError: Direct buffer memory
   [junit4]    >        at java.nio.Bits.reserveMemory(Bits.java:693)
   [junit4]    >        at 
java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
   [junit4]    >        at 
java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
   [junit4]    >        at 
org.apache.solr.store.blockcache.BlockCache.<init>(BlockCache.java:68)
   [junit4]    >        at 
org.apache.solr.core.HdfsDirectoryFactory.createBlockCache(HdfsDirectoryFactory.java:302)
   [junit4]    >        ... 42 more
[...]
   [junit4]   2> 415746 ERROR 
(SUITE-HdfsDirectoryFactoryTest-seed#[200C6D6D2F8C2C5F]-worker) [    ] 
o.a.h.m.l.MethodMetric Error invoking method getBlocksTotal
   [junit4]   2> java.lang.reflect.InvocationTargetException
   [junit4]   2>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
   [junit4]   2>        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]   2>        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]   2>        at java.lang.reflect.Method.invoke(Method.java:498)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.lib.MethodMetric$2.snapshot(MethodMetric.java:111)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.lib.MethodMetric.snapshot(MethodMetric.java:144)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.lib.MetricsRegistry.snapshot(MetricsRegistry.java:401)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.lib.MetricsSourceBuilder$1.getMetrics(MetricsSourceBuilder.java:79)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:194)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
   [junit4]   2>        at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getClassName(DefaultMBeanServerInterceptor.java:1804)
   [junit4]   2>        at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.safeGetClassName(DefaultMBeanServerInterceptor.java:1595)
   [junit4]   2>        at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.checkMBeanPermission(DefaultMBeanServerInterceptor.java:1813)
   [junit4]   2>        at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:430)
   [junit4]   2>        at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
   [junit4]   2>        at 
com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:81)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.stopMBeans(MetricsSourceAdapter.java:226)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.stop(MetricsSourceAdapter.java:211)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.stopSources(MetricsSystemImpl.java:463)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.stop(MetricsSystemImpl.java:213)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.shutdown(MetricsSystemImpl.java:594)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.shutdownInstance(DefaultMetricsSystem.java:72)
   [junit4]   2>        at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.shutdown(DefaultMetricsSystem.java:68)
   [junit4]   2>        at 
org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics.shutdown(NameNodeMetrics.java:171)
   [junit4]   2>        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.stop(NameNode.java:872)
   [junit4]   2>        at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1726)
   [junit4]   2>        at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1705)
   [junit4]   2>        at 
org.apache.solr.cloud.hdfs.HdfsTestUtil.teardownClass(HdfsTestUtil.java:198)
   [junit4]   2>        at 
org.apache.solr.core.HdfsDirectoryFactoryTest.teardownClass(HdfsDirectoryFactoryTest.java:61)
   [junit4]   2>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
   [junit4]   2>        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]   2>        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]   2>        at java.lang.reflect.Method.invoke(Method.java:498)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:870)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
   [junit4]   2>        at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   [junit4]   2>        at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   [junit4]   2>        at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
   [junit4]   2>        at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
   [junit4]   2>        at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
   [junit4]   2>        at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   [junit4]   2>        at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> Caused by: java.lang.NullPointerException
   [junit4]   2>        at 
org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:203)
   [junit4]   2>        at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getTotalBlocks(BlockManager.java:3370)
   [junit4]   2>        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlocksTotal(FSNamesystem.java:5729)
   [junit4]   2>        ... 54 more
[...]
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {}, 
docValues:{}, maxPointsInLeafNode=174, maxMBSortInHeap=6.915978870333232, 
sim=RandomSimilarity(queryNorm=false): {}, locale=zh-TW, 
timezone=America/Argentina/Buenos_Aires
   [junit4]   2> NOTE: Linux 3.13.0-85-generic amd64/Oracle Corporation 
1.8.0_102 (64-bit)/cpus=4,threads=2,free=364870536,total=525860864
   [junit4]   2> NOTE: All tests run in this JVM: [TestFileDictionaryLookup, 
HdfsChaosMonkeySafeLeaderTest, DistributedDebugComponentTest, 
TestExactSharedStatsCache, TestLFUCache, TestFieldCacheSort, 
HdfsUnloadDistributedZkTest, TestLockTree, TestHighlightDedupGrouping, 
TestDFISimilarityFactory, SolrRequestParserTest, SyncSliceTest, 
CreateCollectionCleanupTest, HdfsDirectoryFactoryTest]
{noformat}
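
For context on the allocation that blows up: per the trace above, BlockCache reserves its cache slabs off-heap via ByteBuffer.allocateDirect() (BlockCache.java:68), so the reservation is charged against the JVM-wide -XX:MaxDirectMemorySize limit rather than the heap. A minimal sketch of that failure mode (not Solr code; the 128 MB figure just mirrors the ~134.217728 MB request in the log):

{code:java}
import java.nio.ByteBuffer;

public class DirectAllocationDemo {
  public static void main(String[] args) {
    // Roughly the same request size as the failed BlockCache slab above.
    int requested = 128 * 1024 * 1024;
    try {
      // Off-heap allocation: counts against -XX:MaxDirectMemorySize,
      // which defaults to approximately the maximum heap size.
      ByteBuffer slab = ByteBuffer.allocateDirect(requested);
      System.out.println("allocated " + slab.capacity() + " bytes off-heap");
    } catch (OutOfMemoryError e) {
      // Same failure mode as the logs: "OutOfMemoryError: Direct buffer memory",
      // thrown from java.nio.Bits.reserveMemory when the limit is exhausted.
      System.err.println("direct allocation failed: " + e);
    }
  }
}
{code}

Running it with e.g. {{java -XX:MaxDirectMemorySize=64m DirectAllocationDemo}} reproduces the error path; presumably the test JVMs on these Jenkins nodes hit the same ceiling once several block caches have been allocated in one suite run.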

From [https://builds.apache.org/job/Lucene-Solr-NightlyTests-6.x/207]:

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=BlockDirectoryTest 
-Dtests.method=testRandomAccessWritesLargeCache -Dtests.seed=85E88260B81B20E2 
-Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-6.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=id-ID -Dtests.timezone=Africa/Libreville -Dtests.asserts=true 
-Dtests.file.encoding=US-ASCII
   [junit4] ERROR   1.64s J1 | 
BlockDirectoryTest.testRandomAccessWritesLargeCache <<<
   [junit4]    > Throwable #1: java.lang.OutOfMemoryError: Direct buffer memory
   [junit4]    >        at java.nio.Bits.reserveMemory(Bits.java:693)
   [junit4]    >        at 
java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
   [junit4]    >        at 
java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
   [junit4]    >        at 
org.apache.solr.store.blockcache.BlockCache.<init>(BlockCache.java:68)
   [junit4]    >        at 
org.apache.solr.store.blockcache.BlockDirectoryTest.setUp(BlockDirectoryTest.java:119)
   [junit4]    >        at java.lang.Thread.run(Thread.java:745)Throwable #2: 
java.lang.NullPointerException
   [junit4]    >        at 
org.apache.solr.store.blockcache.BlockDirectoryTest.tearDown(BlockDirectoryTest.java:131)
[...]
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene62): {}, 
docValues:{}, maxPointsInLeafNode=1406, maxMBSortInHeap=7.589330986925872, 
sim=RandomSimilarity(queryNorm=false,coord=yes): {}, locale=id-ID, 
timezone=Africa/Libreville
   [junit4]   2> NOTE: Linux 3.13.0-85-generic amd64/Oracle Corporation 
1.8.0_102 (64-bit)/cpus=4,threads=1,free=317294960,total=496500736
   [junit4]   2> NOTE: All tests run in this JVM: 
[WordBreakSolrSpellCheckerTest, DocumentBuilderTest, TestHashQParserPlugin, 
TestDynamicFieldResource, DateMathParserTest, CollectionsAPIDistributedZkTest, 
HdfsRecoveryZkTest, TestDocBasedVersionConstraints, TestCloudManagedSchema, 
SpellCheckCollatorTest, HdfsBasicDistributedZkTest, TestLuceneMatchVersion, 
SpatialFilterTest, CustomCollectionTest, TestUseDocValuesAsStored2, 
TestCharFilters, BlockDirectoryTest]
{noformat}
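
The error text in the first trace names the two knobs: either raise the cap on the container side ({{-XX:MaxDirectMemorySize=<size>g -XX:+UseLargePages}} in the startup args) or turn off direct allocation in solrconfig.xml. A hedged sketch of the solrconfig.xml side (element names follow the Solr "Running Solr on HDFS" docs; values are placeholders, not a recommendation for these test boxes):

{code:xml}
<!-- Sketch only: forces the HDFS block cache onto the Java heap instead of
     direct memory, trading the direct-buffer OOM above for heap pressure. -->
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://namenode:8020/solr</str> <!-- placeholder URI -->
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">false</bool>
</directoryFactory>
{code}
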

> The HDFS BlockDirectoryCache should not let its keysToRelease or names maps 
> grow indefinitely.
> -----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9284
>                 URL: https://issues.apache.org/jira/browse/SOLR-9284
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: hdfs
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: master (7.0), 6.4
>
>         Attachments: SOLR-9284.patch, SOLR-9284.patch
>
>
> https://issues.apache.org/jira/browse/SOLR-9284


