[
https://issues.apache.org/jira/browse/SOLR-10375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152057#comment-16152057
]
David Smiley commented on SOLR-10375:
-------------------------------------
bq. Also happens on an insert
Good catch [~mbraun688]; file an issue for that.
> Stored text retrieved via StoredFieldVisitor on doc in the document cache
> over-estimates needed byte[]
> ------------------------------------------------------------------------------------------------------
>
> Key: SOLR-10375
> URL: https://issues.apache.org/jira/browse/SOLR-10375
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Environment: Java 1.8.121, Linux x64
> Reporter: Michael Braun
> Priority: Minor
>
> When SolrIndexSearcher.doc(int n, StoredFieldVisitor visitor) is called (as can
> happen with the UnifiedHighlighter in particular) and the document cache
> holds the document, visitFromCached is invoked and can fail with an
> OutOfMemoryError because of line 752 of SolrIndexSearcher:
> visitor.stringField(info, f.stringValue().getBytes(StandardCharsets.UTF_8));
> {code}
> at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
> at java.lang.StringCoding.encode(Ljava/nio/charset/Charset;[CII)[B (StringCoding.java:350)
> at java.lang.String.getBytes(Ljava/nio/charset/Charset;)[B (String.java:941)
> at org.apache.solr.search.SolrIndexSearcher.visitFromCached(Lorg/apache/lucene/document/Document;Lorg/apache/lucene/index/StoredFieldVisitor;)V (SolrIndexSearcher.java:685)
> at org.apache.solr.search.SolrIndexSearcher.doc(ILorg/apache/lucene/index/StoredFieldVisitor;)V (SolrIndexSearcher.java:652)
> {code}
> This is due to the current String.getBytes(Charset) implementation, which
> sizes the underlying byte array as charArrayLength * maxBytesPerCharacter,
> where maxBytesPerCharacter is 3 for UTF-8. 3 * 716MB exceeds
> Integer.MAX_VALUE, and the JVM cannot allocate a single array that large,
> so an OutOfMemoryError is thrown.
> The problem is not present when the document cache is disabled.
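
The arithmetic in the description can be checked with a small sketch. It is not the Solr or JDK code: the class and method names are hypothetical, and it assumes the 716MB figure corresponds to roughly 716 million chars. It computes the worst-case estimate (char count times the encoder's maxBytesPerChar) in a long so the overflow is visible, and shows one way to count the exact UTF-8 length without allocating a byte[].

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from SolrIndexSearcher or the JDK.
final class Utf8SizeSketch {

    /** Worst-case buffer size: char count times the UTF-8 encoder's maxBytesPerChar. */
    static long worstCaseBytes(int charCount) {
        float maxBytesPerChar = StandardCharsets.UTF_8.newEncoder().maxBytesPerChar();
        // Use double/long arithmetic so the result is not truncated to an int.
        return (long) (charCount * (double) maxBytesPerChar);
    }

    /** Exact UTF-8 length of s, counted without allocating a byte[]. */
    static long exactUtf8Length(CharSequence s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;                       // ASCII
            } else if (c < 0x800) {
                bytes += 2;                       // two-byte sequence
            } else if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                bytes += 4;                       // valid surrogate pair
                i++;                              // skip the low surrogate
            } else if (Character.isSurrogate(c)) {
                bytes += 1;                       // unpaired surrogate: getBytes substitutes '?'
            } else {
                bytes += 3;                       // remaining BMP characters
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        int charCount = 716_000_000;              // assumed size of the stored value, in chars
        long estimate = worstCaseBytes(charCount);
        System.out.println("worst-case estimate: " + estimate);
        System.out.println("Integer.MAX_VALUE:   " + Integer.MAX_VALUE);
        System.out.println("estimate fits in an array: " + (estimate <= Integer.MAX_VALUE));
        System.out.println("exact length of sample: " + exactUtf8Length("h\u00e9llo"));
    }
}
```

For mostly ASCII text the exact length is close to one byte per char, which is why the worst-case factor of 3 over-estimates so heavily here.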