[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1
[ https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651294#comment-14651294 ]

Hudson commented on NUTCH-2072:
---

SUCCESS: Integrated in Nutch-trunk #3237 (See [https://builds.apache.org/job/Nutch-trunk/3237/])
Fix for NUTCH-2072: Deflate encoding support is broken when http.content.limit is set to -1 contributed by Tanguy Moal tan...@cogniteev.com this closes #48. (mattmann: http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1693843)
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java

Deflate encoding support is broken when http.content.limit is set to -1
---

Key: NUTCH-2072
URL: https://issues.apache.org/jira/browse/NUTCH-2072
Project: Nutch
Issue Type: Bug
Components: plugin, protocol
Reporter: Tanguy Moal
Assignee: Chris A. Mattmann
Priority: Minor
Fix For: 1.11

The method {{DeflateUtils.inflateBestEffort(byte[] in, int sizeLimit)}} is not designed to have sizeLimit set to a negative value.
The fix can be simply to mimic what's done with gzip encoding: if {{getMaxContent() < 0}} then use {{Integer.MAX_VALUE}} for the {{sizeLimit}} argument.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1
[ https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651277#comment-14651277 ]

Chris A. Mattmann commented on NUTCH-2072:
--

Tests pass:

{noformat}
copy-generated-lib:

test:
[echo] Testing plugin: urlnormalizer-slash
[junit] WARNING: multiple versions of ant detected in path for junit
[junit] jar:file:/usr/local/Cellar/ant/1.9.4/libexec/lib/ant.jar!/org/apache/tools/ant/Project.class
[junit] and jar:file:/Users/mattmann/tmp/nutch-trunk/build/test/lib/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
[junit] Running org.apache.nutch.net.urlnormalizer.slash.TestSlashURLNormalizer
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.055 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.856 sec
[junit] Running org.apache.nutch.tika.TestRTFParser
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.125 sec
[junit] Running org.apache.nutch.tika.TestRobotsMetaProcessor
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 17.994 sec

BUILD SUCCESSFUL
Total time: 13 minutes 21 seconds
{noformat}

Committing this now. Thanks.
[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1
[ https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647385#comment-14647385 ]

ASF GitHub Bot commented on NUTCH-2072:
---

GitHub user tuxnco opened a pull request:

https://github.com/apache/nutch/pull/48

Fix for NUTCH-2072

{{HttpBase}}: mimic the behaviour of {{processGzipEncoded}} in {{processDeflateEncoded}} regarding the handling of {{http.content.limit}}, especially when it's negative (unlimited).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cogniteev/nutch trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nutch/pull/48.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #48

commit e5a0a0943b91a64ee0cd71314546f0876df7789b
Author: Tanguy Moal tan...@cogniteev.com
Date: 2015-07-30T09:08:40Z

HttpBase: fix bug when http.content.limit is set to -1 and remote server uses deflate encoding
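For readers following the thread, here is a minimal sketch of the idea behind the fix: treat a negative content limit as "unlimited" by mapping it to {{Integer.MAX_VALUE}} before inflating. The class name, {{effectiveLimit}}, and this {{inflateBestEffort}} are hypothetical stand-ins written against {{java.util.zip}}; they are not the Nutch {{DeflateUtils}} or {{HttpBase}} code, and the real method may fail differently on a negative limit than this stand-in does.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class DeflateLimitSketch {

  /** Maps a negative (unlimited) content limit to Integer.MAX_VALUE. */
  static int effectiveLimit(int maxContent) {
    return maxContent < 0 ? Integer.MAX_VALUE : maxContent;
  }

  /** Hypothetical stand-in for a best-effort inflater; it assumes sizeLimit >= 0. */
  static byte[] inflateBestEffort(byte[] in, int sizeLimit) {
    Inflater inflater = new Inflater();
    inflater.setInput(in);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    try {
      // With a negative sizeLimit this condition is false at once,
      // so nothing is decoded: the limit was never meant to be negative.
      while (!inflater.finished() && out.size() < sizeLimit) {
        int n = inflater.inflate(buf);
        if (n == 0) {
          break; // truncated input or dictionary needed: stop here
        }
        out.write(buf, 0, Math.min(n, sizeLimit - out.size()));
      }
    } catch (DataFormatException e) {
      // Best effort: keep whatever was decoded before the error.
    } finally {
      inflater.end();
    }
    return out.toByteArray();
  }

  /** Helper that builds a zlib/deflate body in memory for the demo. */
  static byte[] deflate(byte[] raw) {
    Deflater deflater = new Deflater();
    deflater.setInput(raw);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] raw = new byte[5000];
    Arrays.fill(raw, (byte) 'a');
    byte[] packed = deflate(raw);
    int maxContent = -1; // same meaning as http.content.limit = -1
    System.out.println(inflateBestEffort(packed, maxContent).length);                 // prints 0
    System.out.println(inflateBestEffort(packed, effectiveLimit(maxContent)).length); // prints 5000
  }
}
```

Clamping at the call site keeps the inflater itself unchanged, which matches the "mimic what gzip does" approach described in the pull request.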
[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1
[ https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647388#comment-14647388 ]

Tanguy Moal commented on NUTCH-2072:
---

I provided a dumb fix there: https://github.com/apache/nutch/pull/48 .

I couldn't find any tests covering the handling of HTTP compression together with the {{http.content.limit}} parameter, and setting those up seems tedious. Feel free to guide me if we want to make that part more robust.
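On the testing point, one possible shape for such a check is sketched below. It builds gzip and deflate bodies in memory and asserts that both encodings honour the same limits, with a negative limit meaning unlimited. Everything here ({{CompressionLimitFixture}}, {{encode}}, {{decode}}) is a hypothetical fixture on top of {{java.util.zip}}; it does not call Nutch code, so a real test would route the bodies through the plugin's own decoding methods instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

class CompressionLimitFixture {

  /** Compresses raw bytes in memory as gzip or zlib/deflate. */
  static byte[] encode(byte[] raw, boolean gzip) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (OutputStream out = gzip ? new GZIPOutputStream(bytes) : new DeflaterOutputStream(bytes)) {
      out.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Decodes at most limit bytes; a negative limit is treated as unlimited (assumed convention). */
  static byte[] decode(byte[] body, boolean gzip, int limit) {
    int cap = limit < 0 ? Integer.MAX_VALUE : limit;
    ByteArrayInputStream source = new ByteArrayInputStream(body);
    try (InputStream in = gzip ? new GZIPInputStream(source) : new InflaterInputStream(source)) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[512];
      while (out.size() < cap) {
        int n = in.read(buf, 0, Math.min(buf.length, cap - out.size()));
        if (n < 0) {
          break; // end of the compressed stream
        }
        out.write(buf, 0, n);
      }
      return out.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] raw = new byte[2000];
    int[] limits = {-1, 0, 100, 5000};
    for (int limit : limits) {
      int expected = limit < 0 ? raw.length : Math.min(limit, raw.length);
      for (boolean gzip : new boolean[] {true, false}) {
        int actual = decode(encode(raw, gzip), gzip, limit).length;
        if (actual != expected) {
          throw new AssertionError((gzip ? "gzip" : "deflate") + " with limit " + limit);
        }
      }
    }
    System.out.println("gzip and deflate agree for all limits");
  }
}
```

Because the fixtures are plain byte arrays, no HTTP server is needed, which may remove most of the setup cost mentioned above.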