Re: gzip compression solr 8.4.1
Hi,

We ran further tests to find out exactly where the problem is. These are our results:

The Content-Length is calculated correctly; a quick test with curl confirmed this. The actual problem is that the stream carrying the gzip data is not fully consumed and is not closed afterwards.

A debugger breakpoint at org/apache/solr/common/util/Utils.java:575 shows that execution never enters readFully(entity.getContent()), most likely because of how the gzip stream content is wrapped and extracted beforehand.

At org/apache/solr/common/util/Utils.java:582, consumeQuietly(entity) should close the stream, but it does not because of a silent exception.

This appears to be the same problem as described in
https://issues.apache.org/jira/browse/SOLR-14457

We saw the problem with correct GZIP responses from Jetty as well, not only with non-GZIP responses as described in the Jira issue.

Best,

Johannes

On Thu, Apr 23, 2020 at 09:55, Johannes Siegert <johannes.sieg...@offerista.com> wrote:

> Hi,
>
> we want to use gzip compression between our application and the Solr
> server.
>
> We use a standalone Solr server, version 8.4.1, with the prepackaged
> Jetty as application server.
>
> We have enabled the Jetty gzip module by adding these two files:
>
> {path_to_solr}/server/modules/gzip.mod (see below the question)
> {path_to_solr}/server/etc/jetty-gzip.xml (see below the question)
>
> Within the application we use an HttpSolrServer that is configured with
> allowCompression=true.
>
> After we released our application, we saw the number of connections in
> the TCP state CLOSE_WAIT rise until the application was no longer able
> to open new connections.
>
> After a long debugging session, we think the problem is that the
> "Content-Length" header returned by Jetty is sometimes wrong when gzip
> compression is enabled.
>
> The SolrJ client uses a ContentLengthInputStream, which relies on the
> "Content-Length" header to detect whether all data has been received.
> But the InputStream cannot be fully consumed, because the value of the
> "Content-Length" header is higher than the actual content length.
>
> Usually the method PoolingHttpClientConnectionManager.releaseConnection
> is called after the InputStream has been fully consumed. This frees the
> connection so that it can be reused or closed by the application.
>
> Because of the incorrect "Content-Length" header,
> PoolingHttpClientConnectionManager.releaseConnection is never called
> and the connection stays active. Once Jetty's connection timeout is
> reached, Jetty closes the connection from the server side and the TCP
> state switches to CLOSE_WAIT. The client never closes the connection,
> so the number of connections in use keeps rising.
>
> We are currently trying to configure the Jetty gzip module to return no
> "Content-Length" header when gzip compression is used. We hope that in
> this case another InputStream implementation is used, one that uses the
> NULL-terminator to detect when the InputStream has been fully consumed.
>
> Do you have any experience with this problem, or any suggestions for us?
>
> Thanks,
>
> Johannes
>
>
> gzip.mod
>
> -
>
> DO NOT EDIT - See:
> https://www.eclipse.org/jetty/documentation/current/startup-modules.html
>
> [description]
> Enable GzipHandler for dynamic gzip compression
> for the entire server.
>
> [tags]
> handler
>
> [depend]
> server
>
> [xml]
> etc/jetty-gzip.xml
>
> [ini-template]
> ## Minimum content length after which gzip is enabled
> jetty.gzip.minGzipSize=2048
>
> ## Check whether a file with *.gz extension exists
> jetty.gzip.checkGzExists=false
>
> ## Gzip compression level (-1 for default)
> jetty.gzip.compressionLevel=-1
>
> ## User agents for which gzip is disabled
> jetty.gzip.excludedUserAgent=.*MSIE.6\.0.*
>
> -
>
> jetty-gzip.xml
>
> -
>
> [...] "http://www.eclipse.org/jetty/configure_9_3.dtd">
> [...] class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
> [...] deprecated="gzip.minGzipSize" default="2048" />
> [...] deprecated="gzip.checkGzExists" default="false" />
> [...] deprecated="gzip.compressionLevel" default="-1" />
> [...] default="0" />
> [...] default="-1" />
> [...] />
> [...] deprecated="gzip.excludedUserAgent" default=".*MSIE.6\.0.*" />
> [...] default="GET,POST" />
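
The follow-up above attributes the leak to a gzip-wrapped response stream that is neither fully read nor closed. As a rough illustration of the pattern that avoids this, here is a minimal JDK-only sketch. It is not the SolrJ or HttpClient code: the class GzipDrainSketch, the TrackingStream wrapper and the readAndRelease helper are hypothetical names invented for this example, and the "connection" is just an in-memory byte array. The point it tries to show is that the raw stream should be closed even when wrapping it in a GZIPInputStream fails, for example because the body is not gzip at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipDrainSketch {

    /** Hypothetical stand-in for a pooled connection's raw response stream. */
    static final class TrackingStream extends FilterInputStream {
        boolean closed = false;

        TrackingStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true; // in a real client this is where the connection would be released
            super.close();
        }
    }

    /** Compresses a string so the sketch has a gzip body to work with. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray();
    }

    /**
     * Reads the decompressed body to the end and closes the raw stream in
     * every case. Declaring the raw stream as the first resource means it is
     * closed even if the GZIPInputStream constructor throws.
     */
    static String readAndRelease(InputStream raw) throws IOException {
        try (InputStream closer = raw;
             InputStream body = new GZIPInputStream(closer)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = body.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream ok = new TrackingStream(new ByteArrayInputStream(gzip("{\"status\":0}")));
        System.out.println(readAndRelease(ok) + " closed=" + ok.closed);

        // A body that is not gzip: the wrapper fails, but the raw stream is still closed.
        TrackingStream bad = new TrackingStream(
                new ByteArrayInputStream("not gzip".getBytes(StandardCharsets.UTF_8)));
        try {
            readAndRelease(bad);
        } catch (IOException e) {
            System.out.println("wrap failed, closed=" + bad.closed);
        }
    }
}
```

Whether the real code path in Utils.java fails in exactly this way is not something this sketch can confirm; it only shows why a swallowed exception between wrapping and closing can leave the underlying stream open.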
gzip compression solr 8.4.1
Hi,

we want to use gzip compression between our application and the Solr server.

We use a standalone Solr server, version 8.4.1, with the prepackaged Jetty as application server.

We have enabled the Jetty gzip module by adding these two files:

{path_to_solr}/server/modules/gzip.mod (see below the question)
{path_to_solr}/server/etc/jetty-gzip.xml (see below the question)

Within the application we use an HttpSolrServer that is configured with allowCompression=true.

After we released our application, we saw the number of connections in the TCP state CLOSE_WAIT rise until the application was no longer able to open new connections.

After a long debugging session, we think the problem is that the "Content-Length" header returned by Jetty is sometimes wrong when gzip compression is enabled.

The SolrJ client uses a ContentLengthInputStream, which relies on the "Content-Length" header to detect whether all data has been received. But the InputStream cannot be fully consumed, because the value of the "Content-Length" header is higher than the actual content length.

Usually the method PoolingHttpClientConnectionManager.releaseConnection is called after the InputStream has been fully consumed. This frees the connection so that it can be reused or closed by the application.

Because of the incorrect "Content-Length" header, PoolingHttpClientConnectionManager.releaseConnection is never called and the connection stays active. Once Jetty's connection timeout is reached, Jetty closes the connection from the server side and the TCP state switches to CLOSE_WAIT. The client never closes the connection, so the number of connections in use keeps rising.

We are currently trying to configure the Jetty gzip module to return no "Content-Length" header when gzip compression is used. We hope that in this case another InputStream implementation is used, one that uses the NULL-terminator to detect when the InputStream has been fully consumed.

Do you have any experience with this problem, or any suggestions for us?

Thanks,

Johannes


gzip.mod

-

DO NOT EDIT - See:
https://www.eclipse.org/jetty/documentation/current/startup-modules.html

[description]
Enable GzipHandler for dynamic gzip compression
for the entire server.

[tags]
handler

[depend]
server

[xml]
etc/jetty-gzip.xml

[ini-template]
## Minimum content length after which gzip is enabled
jetty.gzip.minGzipSize=2048

## Check whether a file with *.gz extension exists
jetty.gzip.checkGzExists=false

## Gzip compression level (-1 for default)
jetty.gzip.compressionLevel=-1

## User agents for which gzip is disabled
jetty.gzip.excludedUserAgent=.*MSIE.6\.0.*

-

jetty-gzip.xml

-

[...] "http://www.eclipse.org/jetty/configure_9_3.dtd">

-
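
The hypothesis in this first message (a "Content-Length" value larger than the bytes actually sent prevents the body from ever counting as fully consumed) can be sketched with plain JDK streams. Everything here is an assumption made for illustration: LengthBoundedStream is a simplified, hypothetical stand-in and not HttpClient's ContentLengthInputStream, and the payload is an in-memory array, so the read ends with EOF where a live keep-alive socket would instead block or fail. Note also that the later follow-up in this thread reports that the header was in fact calculated correctly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ContentLengthSketch {

    /**
     * Simplified, hypothetical model of a body stream that is delimited by a
     * declared length. It is NOT HttpClient's ContentLengthInputStream.
     */
    static final class LengthBoundedStream extends InputStream {
        private final InputStream in;
        private final long declared;
        private long consumed = 0;

        LengthBoundedStream(InputStream in, long declaredLength) {
            this.in = in;
            this.declared = declaredLength;
        }

        @Override
        public int read() throws IOException {
            if (consumed >= declared) {
                return -1; // body is complete according to the header
            }
            int b = in.read();
            if (b != -1) {
                consumed++;
            }
            return b;
        }

        /** True only once as many bytes were read as the header promised. */
        boolean isComplete() {
            return consumed >= declared;
        }
    }

    /** Reads until no more data arrives and reports whether the body counted as complete. */
    static boolean drain(byte[] payload, long declaredLength) throws IOException {
        LengthBoundedStream body =
                new LengthBoundedStream(new ByteArrayInputStream(payload), declaredLength);
        while (body.read() != -1) {
            // discard the data; only the completion state matters here
        }
        return body.isComplete();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[100];
        System.out.println("correct header:   complete=" + drain(payload, 100));
        System.out.println("oversized header: complete=" + drain(payload, 120));
    }
}
```

In this model a connection manager that releases a connection only after the body is complete would never release it in the oversized case, which is the mechanism the message proposes for the growing number of CLOSE_WAIT connections.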