[jira] [Commented] (NUTCH-1785) Ability to index raw content

2015-07-30 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648294#comment-14648294 ] Chris A. Mattmann commented on NUTCH-1785: -- +1 to commit from me. Ability to

[jira] [Resolved] (NUTCH-1785) Ability to index raw content

2015-07-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1785. - Resolution: Fixed Committed revision 1693507 Ability to index raw content

[jira] [Updated] (NUTCH-1785) Ability to index raw content

2015-07-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1785: Attachment: NUTCH-1785-trunkv2.patch This works perfectly for me locally. I would

[jira] [Comment Edited] (NUTCH-1785) Ability to index raw content

2015-07-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648249#comment-14648249 ] Lewis John McGibbney edited comment on NUTCH-1785 at 7/30/15 8:21 PM:

[jira] [Commented] (NUTCH-1785) Ability to index raw content

2015-07-30 Thread Thad Guidry (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648278#comment-14648278 ] Thad Guidry commented on NUTCH-1785: [~lewismc] No objections. It also worked

[jira] [Commented] (NUTCH-1785) Ability to index raw content

2015-07-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648380#comment-14648380 ] Hudson commented on NUTCH-1785: --- SUCCESS: Integrated in Nutch-trunk #3233 (See

[jira] [Updated] (NUTCH-2071) A parser failure on a single document may fail crawling job

2015-07-30 Thread Arkadi Kosmynin (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arkadi Kosmynin updated NUTCH-2071: --- Attachment: NUTCH-2071.diff A parser failure on a single document may fail crawling job

[jira] [Updated] (NUTCH-2071) A parser failure on a single document may fail crawling job

2015-07-30 Thread Arkadi Kosmynin (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arkadi Kosmynin updated NUTCH-2071: --- Flags: Patch Patch Info: Patch Available A parser failure on a single document may

[jira] [Created] (NUTCH-2071) A parser failure on a single document may fail crawling job

2015-07-30 Thread Arkadi Kosmynin (JIRA)
Arkadi Kosmynin created NUTCH-2071: -- Summary: A parser failure on a single document may fail crawling job Key: NUTCH-2071 URL: https://issues.apache.org/jira/browse/NUTCH-2071 Project: Nutch

[jira] [Commented] (NUTCH-2069) Ignore external links based on domain

2015-07-30 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647467#comment-14647467 ] Julien Nioche commented on NUTCH-2069: -- Hi [~wastl-nagel] and [~markus17]. BTW did

[jira] [Updated] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1

2015-07-30 Thread Tanguy Moal (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanguy Moal updated NUTCH-2072: --- Description: The method {{DeflateUtils.inflateBestEffort(byte[] in, int sizeLimit)}} is not designed

[jira] [Created] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1

2015-07-30 Thread Tanguy Moal (JIRA)
Tanguy Moal created NUTCH-2072: -- Summary: Deflate encoding support is broken when http.content.limit is set to -1 Key: NUTCH-2072 URL: https://issues.apache.org/jira/browse/NUTCH-2072 Project: Nutch

[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1

2015-07-30 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647385#comment-14647385 ] ASF GitHub Bot commented on NUTCH-2072: --- GitHub user tuxnco opened a pull request:

[jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1

2015-07-30 Thread Tanguy Moal (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647388#comment-14647388 ] Tanguy Moal commented on NUTCH-2072: I provided a dumb fix there:

[GitHub] nutch pull request: Fix for NUTCH-2072

2015-07-30 Thread tuxnco
GitHub user tuxnco opened a pull request: https://github.com/apache/nutch/pull/48 Fix for NUTCH-2072 {{HttpBase}} : mimic the behaviour of {{processGzipEncoded}} in {{processDeflateEncoded}} regarding the handling of the {{http.content.limit}} especially when it's negative
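
The pull request description above concerns how a negative {{http.content.limit}} is handled when inflating deflate-encoded content. The following is only a minimal sketch of that idea, not the code from the pull request: it assumes that a negative limit means "do not truncate", and the class and method names (DeflateLimitSketch, effectiveLimit, inflateUpTo, deflate) are hypothetical and not part of Nutch. It uses only java.util.zip from the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration; none of these names exist in Nutch.
final class DeflateLimitSketch {

    // Assumption: a negative content limit means "no limit at all".
    static int effectiveLimit(int contentLimit) {
        return contentLimit < 0 ? Integer.MAX_VALUE : contentLimit;
    }

    // Best-effort inflate: decode at most the effective limit and keep
    // whatever was decoded if the input is truncated or corrupt.
    static byte[] inflateUpTo(byte[] in, int contentLimit) {
        int limit = effectiveLimit(contentLimit);
        Inflater inflater = new Inflater(); // expects zlib-wrapped deflate data
        inflater.setInput(in);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished() && out.size() < limit) {
                int want = Math.min(buf.length, limit - out.size());
                int n = inflater.inflate(buf, 0, want);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // input ended early; return the partial result
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            // Best effort: fall through with the bytes decoded so far.
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Helper used only to produce sample input for the sketch.
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

With this sketch, inflateUpTo(packed, -1) returns the whole decoded body, while inflateUpTo(packed, 100) stops after 100 bytes. Normalising the negative value once, before the decoding loop, keeps the loop free of special cases.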