[
https://issues.apache.org/jira/browse/SOLR-16265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558690#comment-17558690
]
Chris M. Hostetter commented on SOLR-16265:
-------------------------------------------
Given that {{Http2SolrClient.createRequest}} is private and only used in a few
places, it seems like we could probably rethink its API a bit to avoid needing
the {{ByteArrayOutputStream}} and let the {{ContentWriter}} write directly to
an {{OutputStreamContentProvider}} ... but skimming the jetty docs on
{{ContentProvider}} I gather this might have some behavior changes relating to
retries.
So perhaps we could have our own custom {{ContentProvider}} that works similarly
to {{OutputStreamContentProvider}} but makes a new call to
{{ContentWriter.write(...)}} each time the {{iterator()}} is called?
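A rough sketch of that idea, using only JDK classes so it runs without Jetty or Solr on the classpath. {{BodyWriter}} and {{ReplayingContent}} are hypothetical names: {{BodyWriter}} stands in for {{ContentWriter}}, and {{ReplayingContent}} only models the {{Iterable<ByteBuffer>}} part of Jetty's {{ContentProvider}}. The thread-plus-pipe bridge is just one assumed way to turn the "push" writer into a "pull" iterator, not something taken from the existing code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for Solr's ContentWriter: pushes a request body to a stream. */
interface BodyWriter {
  void write(OutputStream os) throws IOException;
}

/** Sketch only: re-runs the writer on every iterator() call, so a retry gets a fresh body. */
class ReplayingContent implements Iterable<ByteBuffer> {
  private static final int CHUNK = 8192; // assumed chunk/pipe size
  private final BodyWriter writer;

  ReplayingContent(BodyWriter writer) {
    this.writer = writer;
  }

  private static PipedOutputStream connect(PipedInputStream in) {
    try {
      return new PipedOutputStream(in);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public Iterator<ByteBuffer> iterator() {
    // Each call (first attempt or retry) starts a new serialization run.
    PipedInputStream in = new PipedInputStream(CHUNK);
    PipedOutputStream out = connect(in);
    AtomicReference<Exception> failure = new AtomicReference<>();
    Thread producer = new Thread(() -> {
      try {
        writer.write(out);
      } catch (IOException | RuntimeException e) {
        failure.set(e); // recorded before close, so the reader sees it at end of stream
      } finally {
        try {
          out.close();
        } catch (IOException ignored) {
          // nothing useful to do in this sketch
        }
      }
    });
    producer.setDaemon(true);
    producer.start();

    return new Iterator<ByteBuffer>() {
      private ByteBuffer pending = fetch();

      private ByteBuffer fetch() {
        try {
          byte[] chunk = new byte[CHUNK];
          int n = in.read(chunk); // blocks until the writer produces data or closes
          if (n >= 0) {
            return ByteBuffer.wrap(chunk, 0, n);
          }
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
        if (failure.get() != null) {
          throw new IllegalStateException("body writer failed", failure.get());
        }
        return null; // clean end of stream
      }

      @Override
      public boolean hasNext() {
        return pending != null;
      }

      @Override
      public ByteBuffer next() {
        if (pending == null) {
          throw new NoSuchElementException();
        }
        ByteBuffer result = pending;
        pending = fetch();
        return result;
      }
    };
  }
}
```

With this shape at most roughly one pipe buffer plus one chunk is held per attempt, at the cost of a helper thread per request. A real version would implement {{ContentProvider}} itself and presumably report an unknown length from {{getLength()}}; whether Jetty's retry handling is happy with that would need checking.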
But in the meantime, just switching the {{ByteArrayOutputStream}} to use the
existing {{BinaryRequestWriter.BAOS}} class would eliminate the {{byte[]}} copy
and give a quick/small improvement ... so I'll open a sub-task for that.
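To illustrate where the copy comes from, here is a minimal JDK-only sketch. {{ExposedBufferStream}} is a hypothetical name for a {{BAOS}}-style subclass; it is not the Solr class itself, just an assumed equivalent that exposes the protected {{buf}} and {{count}} fields of {{ByteArrayOutputStream}}.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

/** Hypothetical BAOS-style stream: exposes the internal buffer instead of copying it. */
class ExposedBufferStream extends ByteArrayOutputStream {

  /** The live backing array; only the first size() bytes are valid data. */
  byte[] rawBuffer() {
    return buf;
  }

  /** Zero-copy view over the bytes written so far. */
  ByteBuffer asByteBuffer() {
    return ByteBuffer.wrap(buf, 0, count);
  }
}
```

{{toByteArray()}} allocates a new array of {{size()}} bytes and copies into it, whereas {{asByteBuffer()}} wraps the array that already holds the serialized data. Handing the result to Jetty would then presumably need a {{ByteBuffer}} based provider such as {{ByteBufferContentProvider}} rather than {{BytesContentProvider}}, since the backing array is usually longer than the valid data.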
> reduce memory usage of ContentWriter based requests in Http2SolrClient
> ----------------------------------------------------------------------
>
> Key: SOLR-16265
> URL: https://issues.apache.org/jira/browse/SOLR-16265
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Priority: Major
>
> I recently noticed the code below exists in
> {{Http2SolrClient.createRequest}}...
> {code}
> if (contentWriter != null) {
>   Request req = httpClient.newRequest(url +
>       wparams.toQueryString()).method(method);
>   ByteArrayOutputStream baos = new ByteArrayOutputStream();
>   contentWriter.write(baos);
>   // TODO reduce memory usage
>   return req.content(
>       new BytesContentProvider(contentWriter.getContentType(),
>           baos.toByteArray()));
> }
> {code}
> * AFAICT there is no (other) existing jira discussing this TODO
> * This method is called for most "simple" HTTP2 based requests
> ** {{Http2SolrClient}} or {{CloudHttp2SolrClient}} -- but not
> {{ConcurrentUpdateHttp2SolrClient}}
> * This block triggers for anything with a {{ContentWriter}}
> ** i.e. all {{UpdateRequests}} ... and in theory other custom requests
> * Part of the issue seems to be that this code repurposes the
> {{ContentWriter}} "push" style API into a "pull" style Jetty client API
> ** Even though {{Http2SolrClient}} has other code used only by
> {{ConcurrentUpdateHttp2SolrClient}} ({{initOutStream(...)}}) which does
> leverage a "push" style Jetty client API: {{OutputStreamContentProvider}}
> * But sillier still: we build one (serialized) {{byte[]}} of the data in memory
> inside the {{ByteArrayOutputStream}}, then we call {{toByteArray()}}, which
> makes a second copy of that {{byte[]}}.