[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1

2015-08-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651294#comment-14651294
 ] 

Hudson commented on NUTCH-2072:
---

SUCCESS: Integrated in Nutch-trunk #3237 (See 
[https://builds.apache.org/job/Nutch-trunk/3237/])
Fix for NUTCH-2072: Deflate encoding support is broken when http.content.limit 
is set to -1 contributed by Tanguy Moal tan...@cogniteev.com this closes #48. 
(mattmann: http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1693843)
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java


 Deflate encoding support is broken when http.content.limit is set to -1
 ---

 Key: NUTCH-2072
 URL: https://issues.apache.org/jira/browse/NUTCH-2072
 Project: Nutch
  Issue Type: Bug
  Components: plugin, protocol
Reporter: Tanguy Moal
Assignee: Chris A. Mattmann
Priority: Minor
 Fix For: 1.11


 The method {{DeflateUtils.inflateBestEffort(byte[] in, int sizeLimit)}} is 
 not designed to have sizeLimit set to a negative value.
 The fix can be simply to mimic what's done with gzip encoding: if 
 {{getMaxContent() < 0}} then use {{Integer.MAX_VALUE}} for the {{sizeLimit}} 
 argument.
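
The clamp described above can be sketched in isolation. This is a minimal, self-contained illustration and not the code committed to HttpBase: the class DeflateLimitSketch and the helpers effectiveLimit, inflateWithLimit and deflate are hypothetical names, and inflateWithLimit is only a stand-in for a size-limited inflater, so the way it misbehaves on a negative limit (returning nothing) is an assumption of this sketch rather than the exact failure seen in Nutch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class DeflateLimitSketch {

  /** Maps a negative ("unlimited") content limit to Integer.MAX_VALUE. */
  static int effectiveLimit(int maxContent) {
    return maxContent < 0 ? Integer.MAX_VALUE : maxContent;
  }

  /** Hypothetical stand-in: inflates zlib data, stopping after sizeLimit bytes. */
  static byte[] inflateWithLimit(byte[] in, int sizeLimit) throws DataFormatException {
    Inflater inflater = new Inflater();
    inflater.setInput(in);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    try {
      // With a negative sizeLimit this condition is false from the start.
      while (!inflater.finished() && out.size() < sizeLimit) {
        int want = Math.min(buf.length, sizeLimit - out.size());
        int n = inflater.inflate(buf, 0, want);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          break; // truncated input: keep whatever was decoded so far
        }
        out.write(buf, 0, n);
      }
    } finally {
      inflater.end();
    }
    return out.toByteArray();
  }

  /** Compresses data with the default zlib wrapper, for the demo only. */
  static byte[] deflate(byte[] data) {
    Deflater deflater = new Deflater();
    deflater.setInput(data);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    return out.toByteArray();
  }

  public static void main(String[] args) throws DataFormatException {
    byte[] page = "<html>deflate me</html>".getBytes(StandardCharsets.UTF_8);
    byte[] packed = deflate(page);
    int configured = -1; // http.content.limit = -1 means "no limit"

    // Passing the raw negative limit: the stand-in decodes nothing.
    System.out.println(inflateWithLimit(packed, configured).length); // prints 0

    // Clamping first restores the full content.
    byte[] full = inflateWithLimit(packed, effectiveLimit(configured));
    System.out.println(Arrays.equals(page, full)); // prints true
  }
}
```

Keeping the clamp in one small helper would let the gzip and deflate paths share the same interpretation of a negative limit, which is the behaviour the report asks for.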



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1

2015-08-02 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651277#comment-14651277
 ] 

Chris A. Mattmann commented on NUTCH-2072:
--

Tests pass:

{noformat}

copy-generated-lib:

test:
 [echo] Testing plugin: urlnormalizer-slash
[junit] WARNING: multiple versions of ant detected in path for junit
[junit]  jar:file:/usr/local/Cellar/ant/1.9.4/libexec/lib/ant.jar!/org/apache/tools/ant/Project.class
[junit]  and jar:file:/Users/mattmann/tmp/nutch-trunk/build/test/lib/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
[junit] Running org.apache.nutch.net.urlnormalizer.slash.TestSlashURLNormalizer
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.055 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.856 sec
[junit] Running org.apache.nutch.tika.TestRTFParser
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.125 sec
[junit] Running org.apache.nutch.tika.TestRobotsMetaProcessor
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 17.994 sec

BUILD SUCCESSFUL
Total time: 13 minutes 21 seconds
{noformat}

Committing this now. Thanks.




[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1

2015-07-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647385#comment-14647385
 ] 

ASF GitHub Bot commented on NUTCH-2072:
---

GitHub user tuxnco opened a pull request:

https://github.com/apache/nutch/pull/48

Fix for NUTCH-2072

{{HttpBase}}: mimic the behaviour of {{processGzipEncoded}} in 
{{processDeflateEncoded}} regarding the handling of {{http.content.limit}}, 
especially when it is negative (unlimited).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cogniteev/nutch trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nutch/pull/48.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #48


commit e5a0a0943b91a64ee0cd71314546f0876df7789b
Author: Tanguy Moal tan...@cogniteev.com
Date:   2015-07-30T09:08:40Z

HttpBase: fix bug when http.content.limit is set to -1 and remote server 
uses deflate encoding






[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1

2015-07-30 Thread Tanguy Moal (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647388#comment-14647388
 ] 

Tanguy Moal commented on NUTCH-2072:


I provided a dumb fix here: https://github.com/apache/nutch/pull/48

I couldn't find any tests covering HTTP compression together with the 
{{http.content.limit}} parameter, and setting those up seems tedious. Feel free 
to guide me if we want to make that part more robust.
