[
https://issues.apache.org/jira/browse/NET-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642642#comment-17642642
]
Gary D. Gregory commented on NET-709:
-------------------------------------
I think the proper way to address this is many-fold, maybe:
* Add a hard configurable limit and throw an exception if is met
* Add a soft configurable limit, truncate and ignore anything beyond that
I would reject the above two based on what the route JRE NIO has taken: Provide
an API that returns a Stream.
This gives you full compatibility using the old API without complications and
new bugs AND a new API where memory usage will be as low as the client wants it
to be.
> IMAP Memory considerations with large ‘FETCH’ sizes.
> ----------------------------------------------------
>
> Key: NET-709
> URL: https://issues.apache.org/jira/browse/NET-709
> Project: Commons Net
> Issue Type: Improvement
> Components: IMAP
> Affects Versions: 3.8.0
> Reporter: Anders
> Priority: Minor
> Labels: IMAP, buffer, chunking, large, literal, memory, partial
> Fix For: 3.10.0
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> h2. *IMAP Memory considerations with large ‘FETCH’ sizes.*
>
> The following comments concern classes in the
> [org.apache.common.net.imap|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/package-summary.html]
> package.
>
> Consider the following imap ‘fetch’ exchange between a client (>) and server
> (<):
> {{> A654 FETCH 1:2 (BODY[TEXT])}}
> {{{}< * 1 FETCH (BODY[TEXT] {*}{{*}{*}80000000{*}{*}}{*}\r\n{}}}{{{}…{}}}
> {{< * 2 FETCH …}}
> {{< A654 OK FETCH completed}}
>
> The first untagged response (* 1 FETCH …) contains a literal \{80000000} or
> 80MB.
>
> After reviewing the
> [source|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298],
> it is my understanding, the entire 80MB sequence of data will be read into
> Java memory even when using
> ‘[IMAPChunkListener|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/IMAP.IMAPChunkListener.html]’.
> According the the documentation:
>
> {quote}Implement this interface and register it via
> [IMAP.setChunkListener(IMAPChunkListener)|https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/imap/IMAP.html#setChunkListener-org.apache.commons.net.imap.IMAP.IMAPChunkListener-]
> in order to get access to multi-line partial command responses. Useful when
> processing large FETCH responses.
> {quote}
>
> It is apparent the partial fetch response is read in full (80MB) before
> invoking the ‘IMAPChunkListener’ and then discarding the read lines (freeing
> up memory).
>
> Back to the example:
> > A654 FETCH 1:2 (BODY[TEXT])
> < * 1 FETCH (BODY[TEXT] \{80000000}\r\n
> *{color:#ff0000}…. <— read in full into memory then discarded after calling
> IMAPChunkListener{color}*
> < * 2 FETCH (BODY[TEXT] \{250}\r\n
> {color:#ff0000}*…. <— read in full into memory then discarded after calling
> IMAPChunkListener*{color}
> < A654 OK FETCH completed
>
> Above, you can see the chunk listener is good for each individual partial
> fetch response but does not prevent a large partial from being loaded into
> memory.
>
> Let’s review the
> [code|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298]:
>
> [
> 296|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l296]
> int literalCount = IMAPReply.literalCount(line);
> {color:#ff0000}Above counts the size of the literal, in our case 80000000 or
> 80MB (for the first partial fetch response).{color}
>
>
> [
> 297|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l297]
> final boolean isMultiLine = literalCount >= 0;
> [
> 298|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l298]
> while (literalCount >= 0) {
> [
> 299|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l299]
> line=_reader.readLine();
> [
> 300|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l300]
> if (line == null)
> { throw new EOFException("Connection closed
> without indication."); }
> [
> 303|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l303]
> replyLines.add(line);
> [
> 304|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l304]
> literalCount -= line.length() + 2; // Allow for CRLF
> [
> 305|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l305]
> }
> {color:#ff0000}The literal count above starts at 80000000 and is decremented
> until reaching a negative non zero value where 80MB is read in full and while
> loop returns.{color}
>
> [
> 306|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l306]
> if (isMultiLine) {
> [
> 307|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l307]
> final IMAPChunkListener il = chunkListener;
> [
> 308|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l308]
> if (il != null) {
> [
> 309|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l309]
> final boolean clear = il.chunkReceived(this);
> [
> 310|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l310]
> if (clear) {
> [
> 311|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l311]
> fireReplyReceived(IMAPReply.PARTIAL,
> getReplyString());
> [
> 312|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l312]
> replyLines.clear();
> {color:#ff0000}Now, after all 80MB is loaded into memory in full, invoke the
> IMAPChunkListener and throw away the lines freeing memory.{color}
>
> [
> 313|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l313]
> }
> [
> 314|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l314]
> }
> [
> 315|https://gitbox.apache.org/repos/asf?p=commons-net.git;a=blob;f=src/main/java/org/apache/commons/net/imap/IMAP.java;h=d97f1073d8b97545d0a063c6832fe55c116166e2;hb=HEAD#l315]
> }
>
> I’m considering modifying the getReply() method, shown above, to chunk the
> partial responses breaking up the literal so that it’s not loaded into memory
> in full. This is to prevent the entire 80MB literal value from being loaded
> into memory.
>
> This would be configurable as not to break the existing users of the API.
> Something like .setBreakLargeLiteralSize(true), when breakUpLargeLiteralSize
> is true, a maxLiteralBuffer value is used to chunk the literal preventing all
> 80MB from being loaded in full, instead loading chunks of it. This would
> require implementations of IMAPChunkListener to handle this behavior if it
> was turned on. The default behavior will see this chunking disabled as to not
> break the existing users. Essentially an opt-in feature reducing the risk.
>
> *{color:#403294}What are you thoughts or concerns with this? Do you
> agree?{color}*
--
This message was sent by Atlassian Jira
(v8.20.10#820010)