[jira] [Updated] (NUTCH-1270) some of Deflate encoded pages not fetched

2014-07-15 Thread Julien Nioche (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Nioche updated NUTCH-1270:
-

Fix Version/s: (was: 1.9)

 some of Deflate encoded pages not fetched
 -

 Key: NUTCH-1270
 URL: https://issues.apache.org/jira/browse/NUTCH-1270
 Project: Nutch
  Issue Type: Bug
  Components: protocol
Affects Versions: 1.4
 Environment: software
Reporter: behnam nikbakht
  Labels: fetch, processDeflateEncoded
 Attachments: NUTCH-1270.patch


 Some web pages are fetched, but their content cannot be retrieved.
 The change below fixes this error.
 We changed lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java:
   public byte[] processDeflateEncoded(byte[] compressed, URL url)
       throws IOException {
     if (LOGGER.isTraceEnabled()) { LOGGER.trace("inflating"); }
     byte[] content = DeflateUtils.inflateBestEffort(compressed,
         getMaxContent());
 +   if (content == null)
 +     content = DeflateUtils.inflateBestEffort(compressed, 20);
     if (content == null)
       throw new IOException("inflateBestEffort returned null");
     if (LOGGER.isTraceEnabled()) {
       LOGGER.trace("fetched " + compressed.length
           + " bytes of compressed content (expanded to "
           + content.length + " bytes) from " + url);
     }
     return content;
   }
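
 The control flow of the patch can be sketched in isolation as follows. This
 is only an illustration, not the Nutch code: the class name
 DeflateFallbackSketch, the method inflateWithFallback, the helper demoLength
 and the fake inflater are all hypothetical, and the inflater is passed in as a
 function so that no assumption is made about how DeflateUtils behaves. Only
 the fallback limit of 20 and the exception message come from the patch.

```java
import java.io.IOException;
import java.util.function.BiFunction;

public class DeflateFallbackSketch {

  // Fallback size limit, taken from the patch in the description.
  static final int FALLBACK_LIMIT = 20;

  /**
   * Tries to inflate with maxContent first; if that yields null, retries
   * with FALLBACK_LIMIT; if that also yields null, throws IOException.
   * The inflater argument stands in for DeflateUtils.inflateBestEffort.
   */
  static byte[] inflateWithFallback(
      BiFunction<byte[], Integer, byte[]> inflater,
      byte[] compressed, int maxContent) throws IOException {
    byte[] content = inflater.apply(compressed, maxContent);
    if (content == null) {
      content = inflater.apply(compressed, FALLBACK_LIMIT);
    }
    if (content == null) {
      throw new IOException("inflateBestEffort returned null");
    }
    return content;
  }

  /** Runs the sketch against a fake inflater; returns -1 on IOException. */
  static int demoLength() {
    // Fake inflater (assumption for the demo): it only succeeds when
    // called with FALLBACK_LIMIT, so the first attempt returns null.
    BiFunction<byte[], Integer, byte[]> fake =
        (in, limit) -> limit == FALLBACK_LIMIT ? new byte[] {1, 2, 3} : null;
    try {
      return inflateWithFallback(fake, new byte[] {0}, 65536).length;
    } catch (IOException e) {
      return -1;
    }
  }

  public static void main(String[] args) {
    System.out.println(demoLength()); // prints 3
  }
}
```

 In the sketch the second attempt is made only when the first one returns
 null, so pages that already inflate correctly are unaffected.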



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (NUTCH-1270) some of Deflate encoded pages not fetched

2014-04-18 Thread Julien Nioche (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Nioche updated NUTCH-1270:
-

Component/s: (was: fetcher)
 protocol

 some of Deflate encoded pages not fetched
 -

 Key: NUTCH-1270
 URL: https://issues.apache.org/jira/browse/NUTCH-1270
 Project: Nutch
  Issue Type: Bug
  Components: protocol
Affects Versions: 1.4
 Environment: software
Reporter: behnam nikbakht
  Labels: fetch, processDeflateEncoded
 Fix For: 1.9

 Attachments: NUTCH-1270.patch







[jira] [Updated] (NUTCH-1270) some of Deflate encoded pages not fetched

2013-01-12 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1270:


   Patch Info: Patch Available
Fix Version/s: 1.7

 some of Deflate encoded pages not fetched
 -

 Key: NUTCH-1270
 URL: https://issues.apache.org/jira/browse/NUTCH-1270
 Project: Nutch
  Issue Type: Bug
  Components: fetcher
Affects Versions: 1.4
 Environment: software
Reporter: behnam nikbakht
  Labels: fetch, processDeflateEncoded
 Fix For: 1.7

 Attachments: NUTCH-1270.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (NUTCH-1270) some of Deflate encoded pages not fetched

2012-03-03 Thread behnam nikbakht (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

behnam nikbakht updated NUTCH-1270:
---

Attachment: NUTCH-1270.patch

 some of Deflate encoded pages not fetched
 -

 Key: NUTCH-1270
 URL: https://issues.apache.org/jira/browse/NUTCH-1270
 Project: Nutch
  Issue Type: Bug
  Components: fetcher
Affects Versions: 1.4
 Environment: software
Reporter: behnam nikbakht
  Labels: fetch, processDeflateEncoded
 Attachments: NUTCH-1270.patch


