[jira] [Commented] (NUTCH-2557) protocol-http fails to follow redirections when an HTTP response body is invalid

2018-06-12 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509469#comment-16509469 ] Omkar Reddy commented on NUTCH-2557: A simple and wise solution. Thanks. > protocol-http fails to

[jira] [Commented] (NUTCH-2565) MergeDB incorrectly handles unfetched CrawlDatums

2018-06-12 Thread Jurian Broertjes (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509573#comment-16509573 ] Jurian Broertjes commented on NUTCH-2565: One solution would be to sum the retries of both

Nutch 1.14 issues

2018-06-12 Thread Arkadi.Kosmynin
Hi guys, I am porting Arch (https://www.atnf.csiro.au/computing/software/arch/) to Nutch 1.14 and Solr 7.2, and I have come across a few serious issues that you should be aware of: 1. NUTCH-2071 is still an issue in 1.14, because the returned parseResult is never null. If a

[jira] [Commented] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509766#comment-16509766 ] ASF GitHub Bot commented on NUTCH-2576: --- sebastian-nagel closed pull request #328: NUTCH-2576 HTTP

[jira] [Resolved] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2576. Resolution: Implemented > HTTP protocol plugin based on okhttp >

[jira] [Work started] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2576 started by Sebastian Nagel. -- > HTTP protocol plugin based on okhttp >

[jira] [Assigned] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel reassigned NUTCH-2576: -- Assignee: Sebastian Nagel > HTTP protocol plugin based on okhttp >

[jira] [Commented] (NUTCH-2595) Upgrade crawler-commons dependency to 0.10

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509785#comment-16509785 ] Hudson commented on NUTCH-2595: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3533 (See

[jira] [Assigned] (NUTCH-2595) Upgrade crawler-commons dependency to 0.10

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel reassigned NUTCH-2595: -- Assignee: Sebastian Nagel > Upgrade crawler-commons dependency to 0.10 >

[jira] [Resolved] (NUTCH-2595) Upgrade crawler-commons dependency to 0.10

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2595. Resolution: Implemented > Upgrade crawler-commons dependency to 0.10 >

[jira] [Commented] (NUTCH-2595) Upgrade crawler-commons dependency to 0.10

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509763#comment-16509763 ] ASF GitHub Bot commented on NUTCH-2595: --- sebastian-nagel closed pull request #345: NUTCH-2595

[jira] [Commented] (NUTCH-2040) Upgrade to recent version of Crawler-Commons

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509768#comment-16509768 ] ASF GitHub Bot commented on NUTCH-2040: --- sebastian-nagel closed pull request #346: NUTCH-2040

[jira] [Resolved] (NUTCH-2040) Upgrade to recent version of Crawler-Commons

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2040. Resolution: Implemented > Upgrade to recent version of Crawler-Commons >

[jira] [Commented] (NUTCH-2012) Merge parsechecker and indexchecker

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509779#comment-16509779 ] ASF GitHub Bot commented on NUTCH-2012: --- sju opened a new pull request #348: NUTCH-2012: output fix

[jira] [Commented] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509786#comment-16509786 ] Hudson commented on NUTCH-2576: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3533 (See

[jira] [Commented] (NUTCH-2565) MergeDB incorrectly handles unfetched CrawlDatums

2018-06-12 Thread Jurian Broertjes (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509801#comment-16509801 ] Jurian Broertjes commented on NUTCH-2565: Maybe it would be sufficient to only test on

[jira] [Commented] (NUTCH-2549) protocol-http does not behave the same as browsers

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509804#comment-16509804 ] ASF GitHub Bot commented on NUTCH-2549: --- sebastian-nagel closed pull request #347: NUTCH-2549

[jira] [Updated] (NUTCH-2032) Plugin to index the raw content of a readable document.

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2032: --- Fix Version/s: (was: 1.15) > Plugin to index the raw content of a readable document. >

[jira] [Updated] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2239: --- Fix Version/s: (was: 1.15) > Selenium Handlers for Ajax Patterns from Student

[jira] [Updated] (NUTCH-2512) Nutch does not build under JDK9

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2512: --- Fix Version/s: (was: 1.15) 1.16 > Nutch does not build under JDK9 >

[jira] [Updated] (NUTCH-2382) indexer-hbase Nutch 1.x branch

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2382: --- Fix Version/s: (was: 1.15) 1.16 > indexer-hbase Nutch 1.x branch >

[jira] [Resolved] (NUTCH-2549) protocol-http does not behave the same as browsers

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2549. Resolution: Fixed Thanks, [~gbouchar] for the careful analysis! > protocol-http does not

[jira] [Updated] (NUTCH-2369) Create a new GraphGenerator Tool for writing Nutch Records as a Full Web Graph

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2369: --- Fix Version/s: (was: 1.15) > Create a new GraphGenerator Tool for writing Nutch Records

[jira] [Commented] (NUTCH-2140) Atomic update and optimistic concurrency update using Solr

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510075#comment-16510075 ] Sebastian Nagel commented on NUTCH-2140: Hi [~roannel], is this still a requirement or is it

[jira] [Updated] (NUTCH-2249) WordNet Integration for Cosine Similarity

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2249: --- Fix Version/s: (was: 1.15) > WordNet Integration for Cosine Similarity >

[jira] [Resolved] (NUTCH-2561) protocol-http can be made to read arbitrarily large HTTP responses

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2561. Resolution: Fixed Thanks, [~gbouchar], esp. for the idea for the unit test server. >

[jira] [Resolved] (NUTCH-2563) HTTP header spellchecking issues

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2563. Resolution: Fixed > HTTP header spellchecking issues > >

[jira] [Resolved] (NUTCH-2557) protocol-http fails to follow redirections when an HTTP response body is invalid

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2557. Resolution: Fixed Thanks, [~gbouchar] and [~omkar20895]! > protocol-http fails to follow

[jira] [Commented] (NUTCH-2030) ParseZip plugin is not able to extract language from zip document,this could solve that problem.

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510069#comment-16510069 ] Sebastian Nagel commented on NUTCH-2030: So, it's about parse-zip or the "lang" field defined in

[jira] [Resolved] (NUTCH-2312) Support PhantomJS as a WebDriver in protocol-selenium

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2312. Resolution: Incomplete Fix Version/s: (was: 1.15) No patch/PR provided so far.

[jira] [Commented] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510090#comment-16510090 ] Sebastian Nagel commented on NUTCH-2239: Hi [~chrismattmann], still in progress? > Selenium

[jira] [Updated] (NUTCH-2265) Write A Test Package for Scoring Similarity

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2265: --- Fix Version/s: (was: 1.15) > Write A Test Package for Scoring Similarity >

[jira] [Resolved] (NUTCH-2555) URL normalization problem: path not starting with a '/'

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2555. Resolution: Fixed > URL normalization problem: path not starting with a '/' >

[jira] [Resolved] (NUTCH-2560) protocol-http throws an error when an http header spans over multiple lines

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2560. Resolution: Cannot Reproduce Thanks, [~gbouchar]. There is now a unit test for multi-line

[jira] [Resolved] (NUTCH-2267) Solr indexer fails at the end of the job with a java error message

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2267. Resolution: Done PR has been merged. Closing this for now. Thanks to everyone involved! >

[jira] [Resolved] (NUTCH-2209) Improved Tokenization for Similarity Scoring plugin

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2209. Resolution: Done This has been already committed (pull request merged) for Nutch 1.12. >

Re: Nutch 1.14 issues

2018-06-12 Thread Arkadi.Kosmynin
Hi Sebastian, Sorry, clarifying my objectives: I am not frustrated, just trying to help. I did not write this message to request fixes for Arch. All these issues have been fixed in Arch, except perhaps the native library issue, but I may fix it as well, if lucky enough. I wrote that message

[jira] [Resolved] (NUTCH-2556) protocol-http makes invalid HTTP/1.0 requests

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2556. Resolution: Fixed HTTP/1.1 is now the default for protocol-http but setting http.useHttp11

[jira] [Updated] (NUTCH-2334) Extension point for schedulers

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2334: --- Fix Version/s: (was: 1.15) 1.16 > Extension point for schedulers >

[jira] [Updated] (NUTCH-2030) ParseZip plugin is not able to extract language from zip document,this could solve that problem.

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2030: --- Fix Version/s: (was: 1.15) 1.16 > ParseZip plugin is not able to

[jira] [Updated] (NUTCH-2267) Solr indexer fails at the end of the job with a java error message

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2267: --- Fix Version/s: (was: 1.15) > Solr indexer fails at the end of the job with a java error

[jira] [Updated] (NUTCH-2209) Improved Tokenization for Similarity Scoring plugin

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2209: --- Fix Version/s: (was: 1.15) > Improved Tokenization for Similarity Scoring plugin >

[jira] [Commented] (NUTCH-2382) indexer-hbase Nutch 1.x branch

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510086#comment-16510086 ] Sebastian Nagel commented on NUTCH-2382: After NUTCH-1480 the patch needs to be updated. Moving

[jira] [Resolved] (NUTCH-2251) Make CommonCrawlFormatJackson instance reusable by properly handling object state

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2251. Resolution: Duplicate Fix Version/s: (was: 1.15) > Make

[jira] [Resolved] (NUTCH-2558) protocol-http cannot handle a missing HTTP status line

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2558. Resolution: Fixed > protocol-http cannot handle a missing HTTP status line >

[jira] [Resolved] (NUTCH-2564) protocol-http throws an error when the content-length header is not a number

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2564. Resolution: Fixed > protocol-http throws an error when the content-length header is not a

[jira] [Updated] (NUTCH-2292) Mavenize the build for nutch-core and nutch-plugins

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2292: --- Fix Version/s: (was: 1.15) 1.16 > Mavenize the build for nutch-core

[jira] [Resolved] (NUTCH-2559) protocol-http cannot handle colons after the HTTP status code

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2559. Resolution: Fixed > protocol-http cannot handle colons after the HTTP status code >

[jira] [Updated] (NUTCH-2147) MetadataScoringFilter for Nutch

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2147: --- Fix Version/s: (was: 1.15) > MetadataScoringFilter for Nutch >

[jira] [Commented] (NUTCH-2565) MergeDB incorrectly handles unfetched CrawlDatums

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509596#comment-16509596 ] Sebastian Nagel commented on NUTCH-2565: I thought first about making the condition in

[jira] [Commented] (NUTCH-2012) Merge parsechecker and indexchecker

2018-06-12 Thread Jurian Broertjes (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509686#comment-16509686 ] Jurian Broertjes commented on NUTCH-2012: It looks like the process() function still uses

Re: Nutch 1.14 issues

2018-06-12 Thread Sebastian Nagel
Hi Arkadi, thanks for your feedback and suggestions. I can understand your frustration but I also want to clarify: - Arch is a nice project, for sure. But Arch is GPL licensed, which makes contributions a one-way route (Nutch -> Arch) and keeps me from even looking into the Arch sources.

[jira] [Commented] (NUTCH-2040) Upgrade to recent version of Crawler-Commons

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509848#comment-16509848 ] Hudson commented on NUTCH-2040: --- SUCCESS: Integrated in Jenkins build Nutch-nutchgora #1612 (See

[jira] [Commented] (NUTCH-2549) protocol-http does not behave the same as browsers

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509864#comment-16509864 ] Hudson commented on NUTCH-2549: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See

[jira] [Commented] (NUTCH-2559) protocol-http cannot handle colons after the HTTP status code

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509868#comment-16509868 ] Hudson commented on NUTCH-2559: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See

[jira] [Commented] (NUTCH-2563) HTTP header spellchecking issues

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509870#comment-16509870 ] Hudson commented on NUTCH-2563: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See

[jira] [Commented] (NUTCH-2558) protocol-http cannot handle a missing HTTP status line

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509869#comment-16509869 ] Hudson commented on NUTCH-2558: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See

[jira] [Commented] (NUTCH-2564) protocol-http throws an error when the content-length header is not a number

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509867#comment-16509867 ] Hudson commented on NUTCH-2564: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See

[jira] [Commented] (NUTCH-2557) protocol-http fails to follow redirections when an HTTP response body is invalid

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509871#comment-16509871 ] Hudson commented on NUTCH-2557: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See

[jira] [Commented] (NUTCH-2560) protocol-http throws an error when an http header spans over multiple lines

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509872#comment-16509872 ] Hudson commented on NUTCH-2560: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See

[jira] [Commented] (NUTCH-2555) URL normalization problem: path not starting with a '/'

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509865#comment-16509865 ] Hudson commented on NUTCH-2555: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See

[jira] [Commented] (NUTCH-2556) protocol-http makes invalid HTTP/1.0 requests

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509866#comment-16509866 ] Hudson commented on NUTCH-2556: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See