[
https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509469#comment-16509469
]
Omkar Reddy commented on NUTCH-2557:
A simple and wise solution. Thanks.
> protocol-http fails to
[
https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509573#comment-16509573
]
Jurian Broertjes commented on NUTCH-2565:
-
One solution would be to sum the retries of both
Hi guys,
I am porting Arch (https://www.atnf.csiro.au/computing/software/arch/) to Nutch
1.14 and Solr 7.2, and I have come across a few serious issues, of which you
should be aware:
1. NUTCH-2071 is still an issue in 1.14, because the returned
parseResult is never null. If a
[
https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509766#comment-16509766
]
ASF GitHub Bot commented on NUTCH-2576:
---
sebastian-nagel closed pull request #328: NUTCH-2576 HTTP
[
https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2576.
Resolution: Implemented
> HTTP protocol plugin based on okhttp
>
[
https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2576 started by Sebastian Nagel.
--
> HTTP protocol plugin based on okhttp
>
[
https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2576:
--
Assignee: Sebastian Nagel
> HTTP protocol plugin based on okhttp
>
[
https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509785#comment-16509785
]
Hudson commented on NUTCH-2595:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3533 (See
[
https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2595:
--
Assignee: Sebastian Nagel
> Upgrade crawler-commons dependency to 0.10
>
[
https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2595.
Resolution: Implemented
> Upgrade crawler-commons dependency to 0.10
>
[
https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509763#comment-16509763
]
ASF GitHub Bot commented on NUTCH-2595:
---
sebastian-nagel closed pull request #345: NUTCH-2595
[
https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509768#comment-16509768
]
ASF GitHub Bot commented on NUTCH-2040:
---
sebastian-nagel closed pull request #346: NUTCH-2040
[
https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2040.
Resolution: Implemented
> Upgrade to recent version of Crawler-Commons
>
[
https://issues.apache.org/jira/browse/NUTCH-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509779#comment-16509779
]
ASF GitHub Bot commented on NUTCH-2012:
---
sju opened a new pull request #348: NUTCH-2012: output fix
[
https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509786#comment-16509786
]
Hudson commented on NUTCH-2576:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3533 (See
[
https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509801#comment-16509801
]
Jurian Broertjes commented on NUTCH-2565:
-
Maybe it would be sufficient to only test on
[
https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509804#comment-16509804
]
ASF GitHub Bot commented on NUTCH-2549:
---
sebastian-nagel closed pull request #347: NUTCH-2549
[
https://issues.apache.org/jira/browse/NUTCH-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2032:
---
Fix Version/s: (was: 1.15)
> Plugin to index the raw content of a readable document.
>
[
https://issues.apache.org/jira/browse/NUTCH-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2239:
---
Fix Version/s: (was: 1.15)
> Selenium Handlers for Ajax Patterns from Student
[
https://issues.apache.org/jira/browse/NUTCH-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2512:
---
Fix Version/s: (was: 1.15)
1.16
> Nutch does not build under JDK9
>
[
https://issues.apache.org/jira/browse/NUTCH-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2382:
---
Fix Version/s: (was: 1.15)
1.16
> indexer-hbase Nutch 1.x branch
>
[
https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2549.
Resolution: Fixed
Thanks, [~gbouchar] for the careful analysis!
> protocol-http does not
[
https://issues.apache.org/jira/browse/NUTCH-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2369:
---
Fix Version/s: (was: 1.15)
> Create a new GraphGenerator Tool for writing Nutch Records
[
https://issues.apache.org/jira/browse/NUTCH-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510075#comment-16510075
]
Sebastian Nagel commented on NUTCH-2140:
Hi [~roannel], is this still a requirement or is it
[
https://issues.apache.org/jira/browse/NUTCH-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2249:
---
Fix Version/s: (was: 1.15)
> WordNet Integration for Cosine Similarity
>
[
https://issues.apache.org/jira/browse/NUTCH-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2561.
Resolution: Fixed
Thanks, [~gbouchar], esp. for the idea for the unit test server.
>
[
https://issues.apache.org/jira/browse/NUTCH-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2563.
Resolution: Fixed
> HTTP header spellchecking issues
>
>
[
https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2557.
Resolution: Fixed
Thanks, [~gbouchar] and [~omkar20895]!
> protocol-http fails to follow
[
https://issues.apache.org/jira/browse/NUTCH-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510069#comment-16510069
]
Sebastian Nagel commented on NUTCH-2030:
So, it's about parse-zip or the "lang" field defined in
[
https://issues.apache.org/jira/browse/NUTCH-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2312.
Resolution: Incomplete
Fix Version/s: (was: 1.15)
No patch/PR provided so far.
[
https://issues.apache.org/jira/browse/NUTCH-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510090#comment-16510090
]
Sebastian Nagel commented on NUTCH-2239:
Hi [~chrismattmann], still in progress?
> Selenium
[
https://issues.apache.org/jira/browse/NUTCH-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2265:
---
Fix Version/s: (was: 1.15)
> Write A Test Package for Scoring Similarity
>
[
https://issues.apache.org/jira/browse/NUTCH-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2555.
Resolution: Fixed
> URL normalization problem: path not starting with a '/'
>
[
https://issues.apache.org/jira/browse/NUTCH-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2560.
Resolution: Cannot Reproduce
Thanks, [~gbouchar]. There is now a unit test for multi-line
Hi Sebastian,
Sorry, clarifying my objectives:
I am not frustrated, just trying to help. I did not write this message to
request fixes for Arch. All these issues have been fixed in Arch, except
perhaps the native library issue, but I may fix it as well, if lucky enough. I
wrote that message
[
https://issues.apache.org/jira/browse/NUTCH-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2556.
Resolution: Fixed
HTTP/1.1 is now the default for protocol-http but setting http.useHttp11
[
https://issues.apache.org/jira/browse/NUTCH-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2334:
---
Fix Version/s: (was: 1.15)
1.16
> Extension point for schedulers
>
[
https://issues.apache.org/jira/browse/NUTCH-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2030:
---
Fix Version/s: (was: 1.15)
1.16
> ParseZip plugin is not able to
[
https://issues.apache.org/jira/browse/NUTCH-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2267:
---
Fix Version/s: (was: 1.15)
> Solr indexer fails at the end of the job with a java error
[
https://issues.apache.org/jira/browse/NUTCH-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2209:
---
Fix Version/s: (was: 1.15)
> Improved Tokenization for Similarity Scoring plugin
>
[
https://issues.apache.org/jira/browse/NUTCH-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510086#comment-16510086
]
Sebastian Nagel commented on NUTCH-2382:
After NUTCH-1480 the patch needs to be updated. Moving
[
https://issues.apache.org/jira/browse/NUTCH-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2251.
Resolution: Duplicate
Fix Version/s: (was: 1.15)
> Make
[
https://issues.apache.org/jira/browse/NUTCH-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2558.
Resolution: Fixed
> protocol-http cannot handle a missing HTTP status line
>
[
https://issues.apache.org/jira/browse/NUTCH-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2564.
Resolution: Fixed
> protocol-http throws an error when the content-length header is not a
[
https://issues.apache.org/jira/browse/NUTCH-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2292:
---
Fix Version/s: (was: 1.15)
1.16
> Mavenize the build for nutch-core
[
https://issues.apache.org/jira/browse/NUTCH-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2559.
Resolution: Fixed
> protocol-http cannot handle colons after the HTTP status code
>
[
https://issues.apache.org/jira/browse/NUTCH-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2147:
---
Fix Version/s: (was: 1.15)
> MetadataScoringFilter for Nutch
>
[
https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509596#comment-16509596
]
Sebastian Nagel commented on NUTCH-2565:
I thought first about making the condition in
[
https://issues.apache.org/jira/browse/NUTCH-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509686#comment-16509686
]
Jurian Broertjes commented on NUTCH-2012:
-
It looks like the process() function still uses
Hi Arkadi,
thanks for your feedback and suggestions.
I can understand your frustration but I also want to clarify:
- Arch is a nice project, for sure. But Arch is GPL licensed,
which makes contributions a one-way route (Nutch -> Arch)
and keeps me from even looking into the Arch sources.
[
https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509848#comment-16509848
]
Hudson commented on NUTCH-2040:
---
SUCCESS: Integrated in Jenkins build Nutch-nutchgora #1612 (See
[
https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509864#comment-16509864
]
Hudson commented on NUTCH-2549:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See
[
https://issues.apache.org/jira/browse/NUTCH-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509868#comment-16509868
]
Hudson commented on NUTCH-2559:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See
[
https://issues.apache.org/jira/browse/NUTCH-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509870#comment-16509870
]
Hudson commented on NUTCH-2563:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See
[
https://issues.apache.org/jira/browse/NUTCH-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509869#comment-16509869
]
Hudson commented on NUTCH-2558:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See
[
https://issues.apache.org/jira/browse/NUTCH-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509867#comment-16509867
]
Hudson commented on NUTCH-2564:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See
[
https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509871#comment-16509871
]
Hudson commented on NUTCH-2557:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See
[
https://issues.apache.org/jira/browse/NUTCH-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509872#comment-16509872
]
Hudson commented on NUTCH-2560:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See
[
https://issues.apache.org/jira/browse/NUTCH-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509865#comment-16509865
]
Hudson commented on NUTCH-2555:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See
[
https://issues.apache.org/jira/browse/NUTCH-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509866#comment-16509866
]
Hudson commented on NUTCH-2556:
---
SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See