[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails

2024-05-28 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850039#comment-17850039 ] Hudson commented on NUTCH-3044: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #163 (See

[jira] [Commented] (NUTCH-3055) README: fix Github "hub" commands

2024-05-28 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850040#comment-17850040 ] Hudson commented on NUTCH-3055: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #163 (See

[jira] [Commented] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters

2024-05-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846795#comment-17846795 ] Hudson commented on NUTCH-3041: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #162 (See

[jira] [Commented] (NUTCH-3043) Generator: count URLs rejected by URL filters

2024-05-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846402#comment-17846402 ] Hudson commented on NUTCH-3043: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #161 (See

[jira] [Commented] (NUTCH-3039) Failure to handle ftp:// URLs

2024-05-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846401#comment-17846401 ] Hudson commented on NUTCH-3039: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #161 (See

[jira] [Commented] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions

2024-04-30 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842426#comment-17842426 ] Hudson commented on NUTCH-3054: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #160 (See

[jira] [Commented] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun

2024-04-08 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835083#comment-17835083 ] Hudson commented on NUTCH-3038: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #157 (See

[jira] [Commented] (NUTCH-3032) Indexing plugin as an adapter for end user's own POJO instances

2024-04-04 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834020#comment-17834020 ] Hudson commented on NUTCH-3032: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #156 (See

[jira] [Commented] (NUTCH-3036) Upgrade org.seleniumhq.selenium:selenium-java dependency in lib-selenium

2024-03-30 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832486#comment-17832486 ] Hudson commented on NUTCH-3036: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #155 (See

[jira] [Commented] (NUTCH-3035) Update license and notice file for release of 1.20

2024-03-30 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832485#comment-17832485 ] Hudson commented on NUTCH-3035: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #155 (See

[jira] [Commented] (NUTCH-3008) indexer-elastic: downgrade to ES 7.10.2 to address licensing issues

2024-03-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827093#comment-17827093 ] Hudson commented on NUTCH-3008: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #154 (See

[jira] [Commented] (NUTCH-3029) Host specific max. and min. intervals in adaptive scheduler

2024-03-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827060#comment-17827060 ] Hudson commented on NUTCH-3029: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #153 (See

[jira] [Commented] (NUTCH-3029) Host specific max. and min. intervals in adaptive scheduler

2024-03-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826846#comment-17826846 ] Hudson commented on NUTCH-3029: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #152 (See

[jira] [Commented] (NUTCH-3029) Host specific max. and min. intervals in adaptive scheduler

2024-03-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826806#comment-17826806 ] Hudson commented on NUTCH-3029: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #151 (See

[jira] [Commented] (NUTCH-3033) Upgrade Ivy to v2.5.2

2024-03-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826773#comment-17826773 ] Hudson commented on NUTCH-3033: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #150 (See

[jira] [Commented] (NUTCH-3029) Host specific max. and min. intervals in adaptive scheduler

2024-03-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826772#comment-17826772 ] Hudson commented on NUTCH-3029: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #150 (See

[jira] [Commented] (NUTCH-3029) Host specific max. and min. intervals in adaptive scheduler

2024-03-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826060#comment-17826060 ] Hudson commented on NUTCH-3029: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #149 (See

[jira] [Commented] (NUTCH-3030) Use system default cipher suites instead of hard-coded set

2024-03-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826038#comment-17826038 ] Hudson commented on NUTCH-3030: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #148 (See

[jira] [Commented] (NUTCH-3031) ProtocolFactory host mapper to support domains

2024-03-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825760#comment-17825760 ] Hudson commented on NUTCH-3031: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #146 (See

[jira] [Commented] (NUTCH-3027) Trivial resource leak patch in DomainSuffixes.java

2024-01-19 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17808626#comment-17808626 ] Hudson commented on NUTCH-3027: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #144 (See

[jira] [Commented] (NUTCH-3024) Remove flaky 'dependency check' target

2023-11-24 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789582#comment-17789582 ] Hudson commented on NUTCH-3024: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #143 (See

[jira] [Commented] (NUTCH-3025) urlfilter-fast to filter based on the length of the URL

2023-11-08 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784204#comment-17784204 ] Hudson commented on NUTCH-3025: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #142 (See

[jira] [Commented] (NUTCH-3017) Allow fast-urlfilter to load from HDFS/S3 and support gzipped input

2023-11-08 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784047#comment-17784047 ] Hudson commented on NUTCH-3017: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #141 (See

[jira] [Commented] (NUTCH-3020) ParseSegment should check for protocol's flags for truncation

2023-11-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783365#comment-17783365 ] Hudson commented on NUTCH-3020: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #140 (See

[jira] [Commented] (NUTCH-3019) Upgrade to Apache Tika 2.9.1

2023-11-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783327#comment-17783327 ] Hudson commented on NUTCH-3019: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #139 (See

[jira] [Commented] (NUTCH-3014) Standardize Job names

2023-11-02 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782403#comment-17782403 ] Hudson commented on NUTCH-3014: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #136 (See

[jira] [Commented] (NUTCH-3015) Add more CI steps to GitHub master-build.yml

2023-10-27 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780553#comment-17780553 ] Hudson commented on NUTCH-3015: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #135 (See

[jira] [Commented] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778206#comment-17778206 ] Hudson commented on NUTCH-3013: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #134 (See

[jira] [Commented] (NUTCH-3012) SegmentReader when dumping with option -recode: NPE on unparsed documents

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778135#comment-17778135 ] Hudson commented on NUTCH-3012: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #133 (See

[jira] [Commented] (NUTCH-3011) HttpRobotRulesParser: handle HTTP 429 Too Many Requests same as server errors (HTTP 5xx)

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778136#comment-17778136 ] Hudson commented on NUTCH-3011: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #133 (See

[jira] [Commented] (NUTCH-2990) HttpRobotRulesParser to follow 5 redirects as specified by RFC 9309

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778123#comment-17778123 ] Hudson commented on NUTCH-2990: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #132 (See

[jira] [Commented] (NUTCH-3002) Protocol-okhttp HttpResponse: HTTP header metadata lookup should be case-insensitive

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778124#comment-17778124 ] Hudson commented on NUTCH-3002: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #132 (See

[jira] [Commented] (NUTCH-3009) Upgrade to Hadoop 3.3.6

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778125#comment-17778125 ] Hudson commented on NUTCH-3009: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #132 (See

[jira] [Commented] (NUTCH-2959) Upgrade to Apache Tika 2.9.0

2023-10-20 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1939#comment-1939 ] Hudson commented on NUTCH-2959: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #131 (See

[jira] [Commented] (NUTCH-2853) bin/nutch: remove deprecated commands solrindex, solrdedup, solrclean

2023-10-03 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771427#comment-17771427 ] Hudson commented on NUTCH-2853: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #130 (See

[jira] [Commented] (NUTCH-2897) Do not suppress deprecated API warnings

2023-10-03 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771426#comment-17771426 ] Hudson commented on NUTCH-2897: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #130 (See

[jira] [Commented] (NUTCH-3010) Injector: count unique number of injected URLs

2023-10-02 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771038#comment-17771038 ] Hudson commented on NUTCH-3010: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #129 (See

[jira] [Commented] (NUTCH-2852) Method invokes System.exit(...) 9 bugs

2023-09-30 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770786#comment-17770786 ] Hudson commented on NUTCH-2852: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #128 (See

[jira] [Commented] (NUTCH-3007) Fix impossible casts

2023-09-30 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770787#comment-17770787 ] Hudson commented on NUTCH-3007: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #128 (See

[jira] [Commented] (NUTCH-3004) Avoid NPE in HttpResponse

2023-09-26 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769168#comment-17769168 ] Hudson commented on NUTCH-3004: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #115 (See

[jira] [Commented] (NUTCH-2978) Move to slf4j2 and remove log4j1 and reload4j

2023-09-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766169#comment-17766169 ] Hudson commented on NUTCH-2978: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #112 (See

[jira] [Commented] (NUTCH-3000) protocol-selenium returns only the body, strips off the element

2023-09-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764803#comment-17764803 ] Hudson commented on NUTCH-3000: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #110 (See

[jira] [Commented] (NUTCH-3001) protocol-selenium requires Content-Type header

2023-09-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764802#comment-17764802 ] Hudson commented on NUTCH-3001: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #110 (See

[jira] [Commented] (NUTCH-2999) Update Lucene version to latest 8.x

2023-08-30 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760595#comment-17760595 ] Hudson commented on NUTCH-2999: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #108 (See

[jira] [Commented] (NUTCH-2999) Update Lucene version to latest 8.x

2023-08-30 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760574#comment-17760574 ] Hudson commented on NUTCH-2999: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #107 (See

[jira] [Commented] (NUTCH-2989) Can't have username/pw AND https on elastic-indexer?!

2023-08-28 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759641#comment-17759641 ] Hudson commented on NUTCH-2989: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #106 (See

[jira] [Commented] (NUTCH-2997) Add Override annotations where applicable

2023-08-22 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757333#comment-17757333 ] Hudson commented on NUTCH-2997: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #103 (See

[jira] [Commented] (NUTCH-2996) Use new SimpleRobotRulesParser API entry point (crawler-commons 1.4)

2023-08-22 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757332#comment-17757332 ] Hudson commented on NUTCH-2996: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #103 (See

[jira] [Commented] (NUTCH-2993) ScoringDepth plugin to skip depth check based on URL Pattern

2023-08-22 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757316#comment-17757316 ] Hudson commented on NUTCH-2993: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #102 (See

[jira] [Commented] (NUTCH-2995) Upgrade to crawler-commons 1.4

2023-08-22 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757315#comment-17757315 ] Hudson commented on NUTCH-2995: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #102 (See

[jira] [Commented] (NUTCH-2991) Support HTTP/S Header Authorization for Solr connections

2023-06-20 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735330#comment-17735330 ] Hudson commented on NUTCH-2991: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #101 (See

[jira] [Commented] (NUTCH-2991) Support HTTP/S Header Authorization for Solr connections

2023-06-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17729750#comment-17729750 ] Hudson commented on NUTCH-2991: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #100 (See

[jira] [Commented] (NUTCH-2992) Fetcher: always block fetch queues when exceptions threshold is reached

2023-05-23 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725476#comment-17725476 ] Hudson commented on NUTCH-2992: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #98 (See

[jira] [Commented] (NUTCH-2596) Upgrade from org.mortbay.jetty to org.eclipse.jetty

2023-03-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17701895#comment-17701895 ] Hudson commented on NUTCH-2596: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #97 (See

[jira] [Commented] (NUTCH-2984) Drop test proxy server and benchmark tool

2023-03-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17701894#comment-17701894 ] Hudson commented on NUTCH-2984: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #97 (See

[jira] [Commented] (NUTCH-2972) Javadoc build fails using JDK 17

2023-03-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696898#comment-17696898 ] Hudson commented on NUTCH-2972: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #96 (See

[jira] [Commented] (NUTCH-2982) Generator: parameter for URL normalization not passed forward

2023-03-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696901#comment-17696901 ] Hudson commented on NUTCH-2982: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #96 (See

[jira] [Commented] (NUTCH-2985) Disable plugin urlfilter-validator by default

2023-03-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696900#comment-17696900 ] Hudson commented on NUTCH-2985: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #96 (See

[jira] [Commented] (NUTCH-2983) nutch-default.xml improvements

2023-03-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696899#comment-17696899 ] Hudson commented on NUTCH-2983: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #96 (See

[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin

2023-03-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696868#comment-17696868 ] Hudson commented on NUTCH-2920: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #95 (See

[jira] [Commented] (NUTCH-2980) Upgrade Selenium Java to 4.7.2

2023-02-18 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17690779#comment-17690779 ] Hudson commented on NUTCH-2980: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #94 (See

[jira] [Commented] (NUTCH-2974) Ant build fails with "Unparseable date" on certain platforms

2023-02-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17690457#comment-17690457 ] Hudson commented on NUTCH-2974: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #93 (See

[jira] [Commented] (NUTCH-2634) Some links marked as "nofollow" are followed anyway.

2023-01-22 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17679654#comment-17679654 ] Hudson commented on NUTCH-2634: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #92 (See

[jira] [Commented] (NUTCH-2634) Some links marked as "nofollow" are followed anyway.

2023-01-08 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655842#comment-17655842 ] Hudson commented on NUTCH-2634: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #91 (See

[jira] [Commented] (NUTCH-2924) Generate maxCount expr evaluated only once

2022-12-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646212#comment-17646212 ] Hudson commented on NUTCH-2924: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #90 (See

[jira] [Commented] (NUTCH-2977) Support for showing dependency tree

2022-12-07 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644524#comment-17644524 ] Hudson commented on NUTCH-2977: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #89 (See

[jira] [Commented] (NUTCH-2883) Provide means to run server as a persistent service in Docker container

2022-09-11 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17602851#comment-17602851 ] Hudson commented on NUTCH-2883: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #88 (See

[jira] [Commented] (NUTCH-2969) Javadoc: Javascript search is not working when built on JDK 11

2022-08-22 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583018#comment-17583018 ] Hudson commented on NUTCH-2969: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #86 (See

[jira] [Commented] (NUTCH-2843) Duplicate declaration of dependencies in ivy.xml

2022-08-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582538#comment-17582538 ] Hudson commented on NUTCH-2843: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #85 (See

[jira] [Commented] (NUTCH-2963) Upgrade dependencies before release of 1.19

2022-08-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582537#comment-17582537 ] Hudson commented on NUTCH-2963: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #85 (See

[jira] [Commented] (NUTCH-2795) CrawlDbReader: compress CrawlDb dumps if configured

2022-08-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582539#comment-17582539 ] Hudson commented on NUTCH-2795: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #85 (See

[jira] [Commented] (NUTCH-2863) Injector to parse command-line flags case-insensitive

2022-08-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582536#comment-17582536 ] Hudson commented on NUTCH-2863: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #85 (See

[jira] [Commented] (NUTCH-2962) Update and complete package info of protocol plugins

2022-08-19 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581898#comment-17581898 ] Hudson commented on NUTCH-2962: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #84 (See

[jira] [Commented] (NUTCH-2930) Protocol-okhttp: implement IP filter

2022-08-19 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581899#comment-17581899 ] Hudson commented on NUTCH-2930: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #84 (See

[jira] [Commented] (NUTCH-2822) Split the LICENSE.txt file into two files for source resp. binary releases

2022-08-19 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581860#comment-17581860 ] Hudson commented on NUTCH-2822: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #83 (See

[jira] [Commented] (NUTCH-2861) Remove parse-swf

2022-08-19 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581859#comment-17581859 ] Hudson commented on NUTCH-2861: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #83 (See

[jira] [Commented] (NUTCH-2290) Update licenses of bundled libraries

2022-08-19 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581861#comment-17581861 ] Hudson commented on NUTCH-2290: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #83 (See

[jira] [Commented] (NUTCH-2955) indexer-solr: replace deprecated/removed field type solr.LatLonType

2022-08-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580870#comment-17580870 ] Hudson commented on NUTCH-2955: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #82 (See

[jira] [Commented] (NUTCH-2957) indexer-solr / Solr schema: add fall-back field definitions for unknown index fields

2022-08-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580871#comment-17580871 ] Hudson commented on NUTCH-2957: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #82 (See

[jira] [Commented] (NUTCH-2896) Protocol-okhttp: make connection pool configurable

2022-08-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579780#comment-17579780 ] Hudson commented on NUTCH-2896: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #81 (See

[jira] [Commented] (NUTCH-2947) Fetcher: keep state of empty fetch queues unless queue feeder is finished

2022-08-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579781#comment-17579781 ] Hudson commented on NUTCH-2947: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #81 (See

[jira] [Commented] (NUTCH-2958) Upgrade to crawler-commons 1.3

2022-08-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579123#comment-17579123 ] Hudson commented on NUTCH-2958: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #80 (See

[jira] [Commented] (NUTCH-2956) index-geoip: dependency upgrades and improvements

2022-08-09 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577334#comment-17577334 ] Hudson commented on NUTCH-2956: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #79 (See

[jira] [Commented] (NUTCH-2952) Upgrade core dependencies (Hadoop 3.3.3, log4j 2.17.2)

2022-08-09 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577286#comment-17577286 ] Hudson commented on NUTCH-2952: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #78 (See

[jira] [Commented] (NUTCH-2936) Early registration of URL stream handlers provided by plugins may fail Hadoop jobs running in distributed mode if protocol-okhttp is used

2022-08-09 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577288#comment-17577288 ] Hudson commented on NUTCH-2936: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #78 (See

[jira] [Commented] (NUTCH-2953) Indexer Elastic to ignore SSL issues

2022-08-09 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577287#comment-17577287 ] Hudson commented on NUTCH-2953: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #78 (See

[jira] [Commented] (NUTCH-2951) Crawl datum with metadata WRITABLE_GENERATE_TIME_KEY awaits fetching forever

2022-06-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17556853#comment-17556853 ] Hudson commented on NUTCH-2951: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #77 (See

[jira] [Commented] (NUTCH-2950) UpdateHostDb: performance improvements

2022-05-24 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541483#comment-17541483 ] Hudson commented on NUTCH-2950: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #76 (See

[jira] [Commented] (NUTCH-2936) Early registration of URL stream handlers provided by plugins may fail Hadoop jobs running in distributed mode

2022-05-20 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17540255#comment-17540255 ] Hudson commented on NUTCH-2936: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #75 (See

[jira] [Commented] (NUTCH-2946) Fetcher: optionally slow down fetching from hosts with repeated exceptions

2022-05-19 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539600#comment-17539600 ] Hudson commented on NUTCH-2946: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #74 (See

[jira] [Commented] (NUTCH-2948) Upgrade dependencies to Any23 2.7 and Tika 2.3.0

2022-05-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536196#comment-17536196 ] Hudson commented on NUTCH-2948: --- FAILURE: Integrated in Jenkins build Nutch » Nutch-trunk #73 (See

[jira] [Commented] (NUTCH-2923) Add Job Id in Job Failure messages

2022-01-27 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483262#comment-17483262 ] Hudson commented on NUTCH-2923: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #72 (See

[jira] [Commented] (NUTCH-2573) Suspend crawling if robots.txt fails to fetch with 5xx status

2022-01-18 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17477667#comment-17477667 ] Hudson commented on NUTCH-2573: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #71 (See

[jira] [Commented] (NUTCH-2935) DeduplicationJob: failure on URLs with invalid percent encoding

2022-01-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17477378#comment-17477378 ] Hudson commented on NUTCH-2935: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #70 (See

[jira] [Commented] (NUTCH-2919) NUTCH-2919 Upgrade to Tika 2.2.1 and Any23 2.6

2022-01-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476720#comment-17476720 ] Hudson commented on NUTCH-2919: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #69 (See

[jira] [Commented] (NUTCH-2929) Fetcher: start threads slowly to avoid that resources are temporarily exhausted

2022-01-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476061#comment-17476061 ] Hudson commented on NUTCH-2929: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #68 (See

[jira] [Commented] (NUTCH-2903) Unable to Connect to Elasticsearch over HTTPS

2022-01-09 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17471326#comment-17471326 ] Hudson commented on NUTCH-2903: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #67 (See

[jira] [Commented] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers

2022-01-07 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17470996#comment-17470996 ] Hudson commented on NUTCH-2429: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #66 (See

[jira] [Commented] (NUTCH-2917) Remove transitive dependency to log4j 1.x

2021-12-22 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17463707#comment-17463707 ] Hudson commented on NUTCH-2917: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #65 (See

[jira] [Commented] (NUTCH-2449) Usage of Tika LanguageIdentifier in language-identifier plugin

2021-12-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461770#comment-17461770 ] Hudson commented on NUTCH-2449: --- ABORTED: Integrated in Jenkins build Nutch » Nutch-trunk #63 (See
