[jira] [Commented] (NUTCH-2852) Method invokes System.exit(...) 9 bugs

2023-09-30 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/NUTCH-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770786#comment-17770786
 ] 

Hudson commented on NUTCH-2852:
---

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #128 (See 
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/128/])
NUTCH-2852 SpotBugs: Method invokes System.exit(...) (snagel: 
[https://github.com/apache/nutch/commit/417b8773231136eb48957f743c2bc3c21f624d4e])
* (edit) src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java
* (edit) src/java/org/apache/nutch/parse/ParserChecker.java
* (edit) src/java/org/apache/nutch/util/AbstractChecker.java
* (edit) src/java/org/apache/nutch/net/URLNormalizerChecker.java
* (edit) src/java/org/apache/nutch/net/URLFilterChecker.java


> Method invokes System.exit(...) 9 bugs
> --
>
> Key: NUTCH-2852
> URL: https://issues.apache.org/jira/browse/NUTCH-2852
> Project: Nutch
> Issue Type: Sub-task
> Affects Versions: 1.18
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 1.20
>
>
> org.apache.nutch.indexer.IndexingFiltersChecker since first historized release
> In class org.apache.nutch.indexer.IndexingFiltersChecker
> In method org.apache.nutch.indexer.IndexingFiltersChecker.run(String[])
> At IndexingFiltersChecker.java:[line 96]
> Another occurrence at IndexingFiltersChecker.java:[line 129]
> org.apache.nutch.indexer.IndexingFiltersChecker.run(String[]) invokes 
> System.exit(...), which shuts down the entire virtual machine
> Invoking System.exit shuts down the entire Java virtual machine. This should 
> only be done when it is appropriate. Such calls make it hard or impossible 
> for your code to be invoked by other code. Consider throwing a 
> RuntimeException instead.
> Also occurs in
>     org.apache.nutch.net.URLFilterChecker since first historized release
>     org.apache.nutch.net.URLNormalizerChecker since first historized release
>     org.apache.nutch.parse.ParseSegment since first historized release
>     org.apache.nutch.parse.ParserChecker since first historized release
>     org.apache.nutch.service.NutchServer since first historized release
>     org.apache.nutch.tools.CommonCrawlDataDumper since first historized release
>     org.apache.nutch.tools.DmozParser since first historized release
>     org.apache.nutch.util.AbstractChecker since first historized release
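
For illustration, the SpotBugs advice quoted above (throw instead of calling System.exit inside run(...)) might look roughly like the sketch below. This is not the code from the Nutch commit: CheckerSketch, UsageException and the exit codes are hypothetical names and values chosen for the example. The idea is that run(...) reports failure by throwing, so other code can call it safely, and only main(...) turns the outcome into a process exit code.

```java
import java.util.Arrays;

/** Hypothetical checker tool for illustration; not the actual Nutch AbstractChecker. */
public class CheckerSketch {

  /** Unchecked exception carrying the exit code the tool would otherwise have passed to System.exit. */
  static class UsageException extends RuntimeException {
    final int exitCode;

    UsageException(String message, int exitCode) {
      super(message);
      this.exitCode = exitCode;
    }
  }

  /** Entry point that other code can call: it reports bad input by throwing, never by exiting the JVM. */
  static int run(String[] args) {
    if (args.length == 0) {
      throw new UsageException("Usage: CheckerSketch <url>", -1);
    }
    System.out.println("Checking " + Arrays.toString(args));
    return 0;
  }

  /** Only main() translates the result of run() into a process exit code. */
  public static void main(String[] args) {
    int status;
    try {
      status = run(args);
    } catch (UsageException e) {
      System.err.println(e.getMessage());
      status = e.exitCode;
    }
    System.exit(status);
  }
}
```

With this shape, a caller embedding the tool can catch UsageException and continue, while command-line behaviour stays the same.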



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (NUTCH-3007) Fix impossible casts

2023-09-30 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/NUTCH-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770787#comment-17770787
 ] 

Hudson commented on NUTCH-3007:
---

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #128 (See 
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/128/])
NUTCH-3007 Fix impossible casts (snagel: 
[https://github.com/apache/nutch/commit/a72a53a32d2183f8a8baefbd50afd007279e4857])
* (edit) src/java/org/apache/nutch/fetcher/Fetcher.java
* (edit) src/java/org/apache/nutch/parse/ParseSegment.java


> Fix impossible casts
> 
>
> Key: NUTCH-3007
> URL: https://issues.apache.org/jira/browse/NUTCH-3007
> Project: Nutch
> Issue Type: Sub-task
> Affects Versions: 1.19
> Reporter: Sebastian Nagel
> Assignee: Sebastian Nagel
> Priority: Major
> Fix For: 1.20
>
>
> SpotBugs reports two occurrences of
>   Impossible cast from java.util.ArrayList to String[] in 
> org.apache.nutch.fetcher.Fetcher.run(Map, String)
> Both were introduced later into the {{run(Map args, String 
> crawlId)}} method and were evidently never executed (they would throw a 
> ClassCastException). The code blocks should be removed.
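
As a small illustration of why such a cast can never succeed, the sketch below reproduces the pattern in isolation. It is not taken from Fetcher.java: the class and method names are hypothetical, and it assumes the value reaches the cast typed as Object (for example after being read from a map of arguments), which is what lets the cast compile at all.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone example; not the code from Fetcher.run(Map, String). */
public class ImpossibleCastSketch {

  /** Converts a list of strings to an array without casting the list itself. */
  static String[] toStringArray(List<String> values) {
    return values.toArray(new String[0]);
  }

  /** Attempts the flagged cast and reports whether it failed at runtime. */
  static boolean castFails(Object value) {
    try {
      String[] unused = (String[]) value;
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    List<String> segments = new ArrayList<>();
    segments.add("segment-a");
    segments.add("segment-b");
    System.out.println(castFails(segments));                        // prints "true"
    System.out.println(String.join(",", toStringArray(segments)));  // prints "segment-a,segment-b"
  }
}
```

An ArrayList is never an instance of String[], so the cast always throws. If an array is really needed, List.toArray(...) is the usual conversion.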





Jenkins build is back to normal : Nutch » Nutch-trunk #128

2023-09-30 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Nutch » Nutch-trunk #127

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
	at hudson.scm.SCM.checkout(SCM.java:540)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
	at hudson.model.Run.execute(Run.java:1900)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1584/2827)

Build failed in Jenkins: Nutch » Nutch-trunk #126

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
	at hudson.scm.SCM.checkout(SCM.java:540)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
	at hudson.model.Run.execute(Run.java:1900)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1584/2827)

Build failed in Jenkins: Nutch » Nutch-trunk #125

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
	at hudson.scm.SCM.checkout(SCM.java:540)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
	at hudson.model.Run.execute(Run.java:1900)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1584/2827)

Build failed in Jenkins: Nutch » Nutch-trunk #124

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
	at hudson.scm.SCM.checkout(SCM.java:540)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
	at hudson.model.Run.execute(Run.java:1900)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1585/2829)

Build failed in Jenkins: Nutch » Nutch-trunk #123

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
	at hudson.scm.SCM.checkout(SCM.java:540)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
	at hudson.model.Run.execute(Run.java:1900)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1585/2829)

Build failed in Jenkins: Nutch » Nutch-trunk #122

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
	at hudson.scm.SCM.checkout(SCM.java:540)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
	at hudson.model.Run.execute(Run.java:1900)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1585/2829)

Build failed in Jenkins: Nutch » Nutch-trunk #121

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
	at hudson.scm.SCM.checkout(SCM.java:540)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
	at hudson.model.Run.execute(Run.java:1900)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1584/2827)

Build failed in Jenkins: Nutch » Nutch-trunk #120

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
at hudson.scm.SCM.checkout(SCM.java:540)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
at hudson.model.Run.execute(Run.java:1900)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
at hudson.model.ResourceController.execute(ResourceController.java:101)
at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1584/2827)

Build failed in Jenkins: Nutch » Nutch-trunk #119

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
at hudson.scm.SCM.checkout(SCM.java:540)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
at hudson.model.Run.execute(Run.java:1900)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
at hudson.model.ResourceController.execute(ResourceController.java:101)
at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1585/2829)

Build failed in Jenkins: Nutch » Nutch-trunk #118

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
at hudson.scm.SCM.checkout(SCM.java:540)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
at hudson.model.Run.execute(Run.java:1900)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
at hudson.model.ResourceController.execute(ResourceController.java:101)
at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1585/2829)

Build failed in Jenkins: Nutch » Nutch-trunk #117

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
at hudson.scm.SCM.checkout(SCM.java:540)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
at hudson.model.Run.execute(Run.java:1900)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
at hudson.model.ResourceController.execute(ResourceController.java:101)
at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1585/2829)

[jira] [Resolved] (NUTCH-2820) Review sample files used in any23 unit tests

2023-09-30 Thread Sebastian Nagel (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel resolved NUTCH-2820.

Resolution: Resolved

Resolved with the removal of the any23 plugin (NUTCH-2998).

> Review sample files used in any23 unit tests
> 
>
> Key: NUTCH-2820
> URL: https://issues.apache.org/jira/browse/NUTCH-2820
> Project: Nutch
>  Issue Type: Bug
>  Components: plugin
>Affects Versions: 1.17
>Reporter: Sebastian Nagel
>Priority: Minor
> Fix For: 1.20
>
>
> The sample files used by unit tests of the any23 plugin include content not 
> applicable to the Apache license. These should be removed or stripped to a 
> minimal snippet (mostly HTML markup).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (NUTCH-2888) Selenium Protocol: Support for Selenium 4

2023-09-30 Thread Sebastian Nagel (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel resolved NUTCH-2888.

Resolution: Duplicate

Thanks, [~mmkivist]! This issue was resolved by NUTCH-2980 and will be included 
in the 1.20 release of Nutch.

> Selenium Protocol: Support for Selenium 4
> -
>
> Key: NUTCH-2888
> URL: https://issues.apache.org/jira/browse/NUTCH-2888
> Project: Nutch
>  Issue Type: New Feature
>  Components: protocol
>Affects Versions: 1.18
>Reporter: Mikko Kivistoe
>Priority: Minor
> Fix For: 1.20
>
>
> Hi,
> Selenium 4 is out and its Grid version now supports HTTPS traffic between 
> the Hub and Nodes. The Selenium 4 API has changed, and it would be good to 
> have Nutch compatible with it.





[jira] [Updated] (NUTCH-2888) Selenium Protocol: Support for Selenium 4

2023-09-30 Thread Sebastian Nagel (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel updated NUTCH-2888:
---
Affects Version/s: 1.18

> Selenium Protocol: Support for Selenium 4
> -
>
> Key: NUTCH-2888
> URL: https://issues.apache.org/jira/browse/NUTCH-2888
> Project: Nutch
>  Issue Type: New Feature
>  Components: protocol
>Affects Versions: 1.18
>Reporter: Mikko Kivistoe
>Priority: Minor
> Fix For: 1.20
>
>
> Hi,
> Selenium 4 is out and its Grid version now supports HTTPS traffic between 
> the Hub and Nodes. The Selenium 4 API has changed, and it would be good to 
> have Nutch compatible with it.





[jira] [Updated] (NUTCH-2888) Selenium Protocol: Support for Selenium 4

2023-09-30 Thread Sebastian Nagel (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel updated NUTCH-2888:
---
Fix Version/s: 1.20

> Selenium Protocol: Support for Selenium 4
> -
>
> Key: NUTCH-2888
> URL: https://issues.apache.org/jira/browse/NUTCH-2888
> Project: Nutch
>  Issue Type: New Feature
>  Components: protocol
>Reporter: Mikko Kivistoe
>Priority: Minor
> Fix For: 1.20
>
>
> Hi,
> Selenium 4 is out and its Grid version now supports HTTPS traffic between 
> the Hub and Nodes. The Selenium 4 API has changed, and it would be good to 
> have Nutch compatible with it.





[GitHub] [nutch] sebastian-nagel opened a new pull request, #785: NUTCH-2853 bin/nutch: remove deprecated commands solrindex, solrdedup, solrclean

2023-09-30 Thread via GitHub


sebastian-nagel opened a new pull request, #785:
URL: https://github.com/apache/nutch/pull/785

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@nutch.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (NUTCH-2853) bin/nutch: remove deprecated commands solrindex, solrdedup, solrclean

2023-09-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/NUTCH-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770676#comment-17770676
 ] 

ASF GitHub Bot commented on NUTCH-2853:
---

sebastian-nagel opened a new pull request, #785:
URL: https://github.com/apache/nutch/pull/785

   (no comment)




> bin/nutch: remove deprecated commands solrindex, solrdedup, solrclean
> -
>
> Key: NUTCH-2853
> URL: https://issues.apache.org/jira/browse/NUTCH-2853
> Project: Nutch
>  Issue Type: Improvement
>  Components: bin
>Affects Versions: 1.18
>Reporter: Sebastian Nagel
>Priority: Major
>  Labels: help-wanted
> Fix For: 1.20
>
>
> The commands "solrindex", "solrdedup" and "solrclean" have been deprecated for 7 
> years and should be removed to avoid any confusion (one example: 
> https://stackoverflow.com/questions/66376609/nutch-solr-index-is-failing).





[jira] [Commented] (NUTCH-2897) Do not supress deprecated API warnings

2023-09-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/NUTCH-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770675#comment-17770675
 ] 

ASF GitHub Bot commented on NUTCH-2897:
---

sebastian-nagel opened a new pull request, #784:
URL: https://github.com/apache/nutch/pull/784

   - deprecate constructor of NutchJob
   - remove deprecated call to Object.finalize() from Plugin.finalize()




> Do not supress deprecated API warnings
> --
>
> Key: NUTCH-2897
> URL: https://issues.apache.org/jira/browse/NUTCH-2897
> Project: Nutch
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.18
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Major
> Fix For: 1.20
>
>
> We suppress deprecation warnings in three places:
> # [Plugin.java#L92-L96|https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/plugin/Plugin.java#L92-L96]
> # [NutchJob.java#L35-L38|https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/util/NutchJob.java#L35-L38], and
> # [TikaParser.java#L92-L95|https://github.com/apache/nutch/blob/master/src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/TikaParser.java#L92-L95]
> Instead of suppressing the warnings, we should use the correct 
> *@Deprecated* annotation and *@deprecated* Javadoc. This is not difficult to 
> do and should have been done the first time around.
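
For illustration only, here is a minimal sketch of pairing the @Deprecated annotation with a @deprecated Javadoc tag. The class DeprecationSketch and its constructors are hypothetical and are not taken from Plugin.java, NutchJob.java or TikaParser.java.

```java
public class DeprecationSketch {
    private final String name;

    /**
     * Creates an instance with a placeholder name.
     *
     * @deprecated hypothetical example; use {@link #DeprecationSketch(String)} instead.
     */
    @Deprecated
    public DeprecationSketch() {
        this("unnamed");
    }

    /** Preferred constructor in this sketch. */
    public DeprecationSketch(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public static void main(String[] args) {
        // The deprecated constructor still works. Callers in other classes
        // get a deprecation warning from javac rather than silence.
        DeprecationSketch legacy = new DeprecationSketch();
        DeprecationSketch current = new DeprecationSketch("segment-1");
        System.out.println(legacy.getName() + " " + current.getName());
    }
}
```

The annotation is what the compiler acts on, while the Javadoc tag tells readers what to use instead, which is why the two are usually written together.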





[GitHub] [nutch] sebastian-nagel opened a new pull request, #784: NUTCH-2897 Do not supress deprecated API warnings

2023-09-30 Thread via GitHub


sebastian-nagel opened a new pull request, #784:
URL: https://github.com/apache/nutch/pull/784

   - deprecate constructor of NutchJob
   - remove deprecated call to Object.finalize() from Plugin.finalize()





Build failed in Jenkins: Nutch » Nutch-trunk #116

2023-09-30 Thread Apache Jenkins Server
See 


Changes:


--
Started by an SCM change
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on builds58 (ubuntu) in workspace 

The recommended git tool is: NONE
No credentials specified
 > git rev-parse --resolve-git-dir 
 >  # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/nutch.git # timeout=10
Fetching upstream changes from https://github.com/apache/nutch.git
 > git --version # timeout=10
 > git --version # 'git version 2.17.1'
 > git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/nutch.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:1003)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1245)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1309)
at hudson.scm.SCM.checkout(SCM.java:540)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1240)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
at hudson.model.Run.execute(Run.java:1900)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
at hudson.model.ResourceController.execute(ResourceController.java:101)
at hudson.model.Executor.run(Executor.java:442)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress -- https://github.com/apache/nutch.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Enumerating objects: 12068, done.
remote: Counting objects:  56% (1585/2829)

[jira] [Resolved] (NUTCH-3007) Fix impossible casts

2023-09-30 Thread Sebastian Nagel (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel resolved NUTCH-3007.

Resolution: Fixed

Thanks for the review, [~markus17]!

> Fix impossible casts
> 
>
> Key: NUTCH-3007
> URL: https://issues.apache.org/jira/browse/NUTCH-3007
> Project: Nutch
>  Issue Type: Sub-task
>Affects Versions: 1.19
>Reporter: Sebastian Nagel
>Assignee: Sebastian Nagel
>Priority: Major
> Fix For: 1.20
>
>
> Spotbugs reports two occurrences of
>   Impossible cast from java.util.ArrayList to String[] in 
> org.apache.nutch.fetcher.Fetcher.run(Map, String)
> Both were introduced later into the {{run(Map args, String 
> crawlId)}} method and obviously never used (would throw a 
> ClassCastException). The code blocks should be removed.
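
As a rough illustration of why such a cast can never succeed, the sketch below casts an ArrayList held in an Object reference to String[] and catches the resulting ClassCastException. The class ImpossibleCastSketch and both helper methods are hypothetical. The conversion helper only shows one safe alternative and is not the change made in Fetcher, where the issue proposes removing the code blocks.

```java
import java.util.ArrayList;
import java.util.List;

public class ImpossibleCastSketch {

    /** Returns true if casting the value to String[] fails at runtime. */
    static boolean castToArrayFails(Object value) {
        try {
            // Compiles because value is typed as Object, but throws for an ArrayList.
            String[] ignored = (String[]) value;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    /** Converts a String[] or a List to String[] without an unchecked cast. */
    static String[] toStringArray(Object value) {
        if (value instanceof String[]) {
            return (String[]) value;
        }
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            String[] result = new String[list.size()];
            for (int i = 0; i < result.length; i++) {
                result[i] = String.valueOf(list.get(i));
            }
            return result;
        }
        return new String[0];
    }

    public static void main(String[] args) {
        List<String> segments = new ArrayList<>();
        segments.add("segment-a");
        segments.add("segment-b");
        System.out.println("cast fails: " + castToArrayFails(segments));
        System.out.println("converted: " + String.join(",", toStringArray(segments)));
    }
}
```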





[jira] [Resolved] (NUTCH-2852) Method invokes System.exit(...) 9 bugs

2023-09-30 Thread Sebastian Nagel (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel resolved NUTCH-2852.

Resolution: Fixed

> Method invokes System.exit(...) 9 bugs
> --
>
> Key: NUTCH-2852
> URL: https://issues.apache.org/jira/browse/NUTCH-2852
> Project: Nutch
> Issue Type: Sub-task
> Affects Versions: 1.18
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 1.20
>
>
> org.apache.nutch.indexer.IndexingFiltersChecker since first historized release
> In class org.apache.nutch.indexer.IndexingFiltersChecker
> In method org.apache.nutch.indexer.IndexingFiltersChecker.run(String[])
> At IndexingFiltersChecker.java:[line 96]
> Another occurrence at IndexingFiltersChecker.java:[line 129]
> org.apache.nutch.indexer.IndexingFiltersChecker.run(String[]) invokes 
> System.exit(...), which shuts down the entire virtual machine
> Invoking System.exit shuts down the entire Java virtual machine. This should 
> only be done when it is appropriate. Such calls make it hard or impossible 
> for your code to be invoked by other code. Consider throwing a 
> RuntimeException instead.
> Also occurs in
>   org.apache.nutch.net.URLFilterChecker since first historized release
>   org.apache.nutch.net.URLNormalizerChecker since first historized release
>   org.apache.nutch.parse.ParseSegment since first historized release
>   org.apache.nutch.parse.ParserChecker since first historized release
>   org.apache.nutch.service.NutchServer since first historized release
>   org.apache.nutch.tools.CommonCrawlDataDumper since first historized release
>   org.apache.nutch.tools.DmozParser since first historized release
>   org.apache.nutch.util.AbstractChecker since first historized release
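
One common way to address this kind of report is to let run() signal failure
through its return value (or an exception) and to call System.exit only from
main(). The sketch below assumes that approach; it is not the change merged
for this issue, and the class name ExitCodeSketch and its usage message are
hypothetical.

```java
// Hypothetical checker, not the actual Nutch code.
class ExitCodeSketch {

  /** Returns 0 on success and -1 on a usage error; never ends the JVM. */
  static int run(String[] args) {
    if (args.length < 1) {
      System.err.println("Usage: ExitCodeSketch <url>");
      return -1;
    }
    System.out.println("checking " + args[0]);
    return 0;
  }

  public static void main(String[] args) {
    // The command-line entry point is the only place that exits the JVM.
    System.exit(run(args));
  }
}
```

With this layout, other code (a test or a server, for example) can call
run() and inspect the result without the virtual machine being shut down.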





[jira] [Commented] (NUTCH-3007) Fix impossible casts

2023-09-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/NUTCH-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770669#comment-17770669
 ] 

ASF GitHub Bot commented on NUTCH-3007:
---

sebastian-nagel merged PR #781:
URL: https://github.com/apache/nutch/pull/781









[jira] [Commented] (NUTCH-2852) Method invokes System.exit(...) 9 bugs

2023-09-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/NUTCH-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770668#comment-17770668
 ] 

ASF GitHub Bot commented on NUTCH-2852:
---

sebastian-nagel merged PR #780:
URL: https://github.com/apache/nutch/pull/780









[GitHub] [nutch] sebastian-nagel merged pull request #781: NUTCH-3007 Fix impossible casts

2023-09-30 Thread via GitHub


sebastian-nagel merged PR #781:
URL: https://github.com/apache/nutch/pull/781


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@nutch.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [nutch] sebastian-nagel merged pull request #780: NUTCH-2852 SpotBugs: Method invokes System.exit(...)

2023-09-30 Thread via GitHub


sebastian-nagel merged PR #780:
URL: https://github.com/apache/nutch/pull/780





[jira] [Commented] (NUTCH-3010) Injector: count unique number of injected URLs

2023-09-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/NUTCH-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770656#comment-17770656
 ] 

ASF GitHub Bot commented on NUTCH-3010:
---

sebastian-nagel opened a new pull request, #783:
URL: https://github.com/apache/nutch/pull/783

   - add counter urls_injected_unique
   - improve log messages reporting the counts of injected/merged URLs




> Injector: count unique number of injected URLs
> --
>
> Key: NUTCH-3010
> URL: https://issues.apache.org/jira/browse/NUTCH-3010
> Project: Nutch
> Issue Type: Improvement
> Components: injector
> Affects Versions: 1.19
> Reporter: Sebastian Nagel
> Assignee: Sebastian Nagel
> Priority: Major
> Fix For: 1.20
>
>
> Injector uses two counters: one for the total number of injected URLs, the 
> other for the number of URLs "merged", that is, already in the CrawlDb. There 
> is no counter for the number of unique URLs injected, which may lead to wrong 
> counts if the seed files contain duplicates.
> Consider the following seed file, which contains a duplicated URL:
> {noformat}
> $> cat seeds_with_duplicates.txt 
> https://www.example.org/page1.html
> https://www.example.org/page2.html
> https://www.example.org/page2.html
> $> $NUTCH_HOME/bin/nutch inject /tmp/crawldb seeds_with_duplicates.txt
> ...
> 2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
> rejected by filters: 0
> 2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
> injected after normalization and filtering: 3
> 2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
> injected but already in CrawlDb: 0
> 2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total new urls 
> injected: 3
> ...
> {noformat}
> However, because of the duplicated URL, only 2 URLs were injected into the 
> CrawlDb:
> {noformat}
> $> $NUTCH_HOME/bin/nutch readdb /tmp/crawldb -stats
> ...
> 2023-09-30 07:39:43,945 INFO o.a.n.c.CrawlDbReader [main] TOTAL urls:   2
> ...
> {noformat}
> If the Injector job is run again with the same input, the output erroneously 
> reports that one "new URL" was still injected:
> {noformat}
> 2023-09-30 07:41:13,625 INFO o.a.n.c.Injector [main] Injector: Total urls 
> rejected by filters: 0
> 2023-09-30 07:41:13,625 INFO o.a.n.c.Injector [main] Injector: Total urls 
> injected after normalization and filtering: 3
> 2023-09-30 07:41:13,626 INFO o.a.n.c.Injector [main] Injector: Total urls 
> injected but already in CrawlDb: 2
> 2023-09-30 07:41:13,626 INFO o.a.n.c.Injector [main] Injector: Total new urls 
> injected: 1
> {noformat}
> This is because the urls_merged counter counts unique items, while 
> urls_injected does not, and the number shown is the difference between the 
> two counters.
> Adding a counter for the number of unique injected URLs will make it possible 
> to report the correct count of newly injected URLs.
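
The arithmetic behind the two log excerpts can be reproduced with a small
model. This is a sketch under stated assumptions, not the Injector
implementation: the class and method names are hypothetical, the CrawlDb is
modelled as a plain set of URLs, and "new URLs" is computed as injected
minus merged, as the description says.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the counters described in the issue.
class InjectorCountSketch {

  /** Unique seed URLs that are already present in the CrawlDb. */
  static int merged(List<String> seeds, Set<String> crawlDb) {
    Set<String> unique = new HashSet<>(seeds);
    unique.retainAll(crawlDb);
    return unique.size();
  }

  /** Count as currently reported: all injected records minus merged URLs. */
  static int reportedNew(List<String> seeds, Set<String> crawlDb) {
    return seeds.size() - merged(seeds, crawlDb);
  }

  /** Count based on unique injected URLs, as the issue proposes. */
  static int uniqueNew(List<String> seeds, Set<String> crawlDb) {
    return new HashSet<>(seeds).size() - merged(seeds, crawlDb);
  }
}
```

For the seed file above, an empty CrawlDb gives 3 - 0 = 3 reported but
2 - 0 = 2 unique new URLs. On the second run the CrawlDb holds both URLs, so
the reported value is 3 - 2 = 1 while the unique-based value is 2 - 2 = 0.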





[GitHub] [nutch] sebastian-nagel opened a new pull request, #783: NUTCH-3010 Injector: count unique number of injected URLs

2023-09-30 Thread via GitHub


sebastian-nagel opened a new pull request, #783:
URL: https://github.com/apache/nutch/pull/783

   - add counter urls_injected_unique
   - improve log messages reporting the counts of injected/merged URLs





[jira] [Updated] (NUTCH-3010) Injector: count unique number of injected URLs

2023-09-30 Thread Sebastian Nagel (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel updated NUTCH-3010:
---
Description: 
Injector uses two counters: one for the total number of injected URLs, the 
other for the number of URLs "merged", that is, already in the CrawlDb. There 
is no counter for the number of unique URLs injected, which may lead to wrong 
counts if the seed files contain duplicates.

Consider the following seed file, which contains a duplicated URL:
{noformat}
$> cat seeds_with_duplicates.txt 
https://www.example.org/page1.html
https://www.example.org/page2.html
https://www.example.org/page2.html

$> $NUTCH_HOME/bin/nutch inject /tmp/crawldb seeds_with_duplicates.txt
...
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
rejected by filters: 0
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected after normalization and filtering: 3
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected but already in CrawlDb: 0
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total new urls 
injected: 3
...
{noformat}
However, because of the duplicated URL, only 2 URLs were injected into the 
CrawlDb:
{noformat}
$> $NUTCH_HOME/bin/nutch readdb /tmp/crawldb -stats
...
2023-09-30 07:39:43,945 INFO o.a.n.c.CrawlDbReader [main] TOTAL urls:   2
...
{noformat}
If the Injector job is run again with the same input, the output erroneously 
reports that one "new URL" was still injected:
{noformat}
2023-09-30 07:41:13,625 INFO o.a.n.c.Injector [main] Injector: Total urls 
rejected by filters: 0
2023-09-30 07:41:13,625 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected after normalization and filtering: 3
2023-09-30 07:41:13,626 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected but already in CrawlDb: 2
2023-09-30 07:41:13,626 INFO o.a.n.c.Injector [main] Injector: Total new urls 
injected: 1
{noformat}
This is because the urls_merged counter counts unique items, while 
urls_injected does not, and the number shown is the difference between the two 
counters.

Adding a counter for the number of unique injected URLs will make it possible 
to report the correct count of newly injected URLs.

  was:
Injector uses two counters: one for the total number of injected URLs, the 
other for the number of URLs "merged", that is, already in the CrawlDb. There 
is no counter for the number of unique URLs injected, which may lead to wrong 
counts if the seed files contain duplicates.

Consider the following seed file, which contains a duplicated URL:

{noformat}
$> cat seeds_with_duplicates.txt 
https://www.example.org/page1.html
https://www.example.org/page2.html
https://www.example.org/page2.html

$> $NUTCH_HOME/bin/nutch inject /tmp/crawldb seeds_with_duplicates.txt
...
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
rejected by filters: 0
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected after normalization and filtering: 3
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected but already in CrawlDb: 0
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total new urls 
injected: 3
...
{noformat}

However, because of the duplicated URL, only 2 URLs were injected into the 
CrawlDb:

{noformat}
$> $NUTCH_HOME/bin/nutch readdb /tmp/crawldb -stats
...
2023-09-30 07:39:43,945 INFO o.a.n.c.CrawlDbReader [main] TOTAL urls:   2
...
{noformat}

If the Injector job is run again with the same input, the output erroneously 
reports that one "new URL" was still injected:

{noformat}
2023-09-30 07:41:13,625 INFO o.a.n.c.Injector [main] Injector: Total urls 
rejected by filters: 0
2023-09-30 07:41:13,625 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected after normalization and filtering: 3
2023-09-30 07:41:13,626 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected but already in CrawlDb: 2
2023-09-30 07:41:13,626 INFO o.a.n.c.Injector [main] Injector: Total new urls 
injected: 1
{noformat}

This is because the urls_merged counter counts unique items, while 
urls_injected does not.

Adding a counter for the number of unique injected URLs will make it possible 
to report the correct count of newly injected URLs.


> Injector: count unique number of injected URLs
> --
>
> Key: NUTCH-3010
> URL: https://issues.apache.org/jira/browse/NUTCH-3010
> Project: Nutch
> Issue Type: Improvement
> Components: injector
> Affects Versions: 1.19
> Reporter: Sebastian Nagel
> Assignee: Sebastian Nagel
> Priority: Major
> Fix For: 1.20
>
>
> Injector uses two counters: one for the total number of injected URLs, the 
> other for the number of URLs "merged", that is, already in the CrawlDb. There 
> is no counter for the number of 

[jira] [Created] (NUTCH-3010) Injector: count unique number of injected URLs

2023-09-30 Thread Sebastian Nagel (Jira)
Sebastian Nagel created NUTCH-3010:
--

Summary: Injector: count unique number of injected URLs
Key: NUTCH-3010
URL: https://issues.apache.org/jira/browse/NUTCH-3010
Project: Nutch
Issue Type: Improvement
Components: injector
Affects Versions: 1.19
Reporter: Sebastian Nagel
Assignee: Sebastian Nagel
Fix For: 1.20


Injector uses two counters: one for the total number of injected URLs, the 
other for the number of URLs "merged", that is, already in the CrawlDb. There 
is no counter for the number of unique URLs injected, which may lead to wrong 
counts if the seed files contain duplicates.

Consider the following seed file, which contains a duplicated URL:

{noformat}
$> cat seeds_with_duplicates.txt 
https://www.example.org/page1.html
https://www.example.org/page2.html
https://www.example.org/page2.html

$> $NUTCH_HOME/bin/nutch inject /tmp/crawldb seeds_with_duplicates.txt
...
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
rejected by filters: 0
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected after normalization and filtering: 3
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected but already in CrawlDb: 0
2023-09-30 07:38:00,185 INFO o.a.n.c.Injector [main] Injector: Total new urls 
injected: 3
...
{noformat}

However, because of the duplicated URL, only 2 URLs were injected into the 
CrawlDb:

{noformat}
$> $NUTCH_HOME/bin/nutch readdb /tmp/crawldb -stats
...
2023-09-30 07:39:43,945 INFO o.a.n.c.CrawlDbReader [main] TOTAL urls:   2
...
{noformat}

If the Injector job is run again with the same input, the output erroneously 
reports that one "new URL" was still injected:

{noformat}
2023-09-30 07:41:13,625 INFO o.a.n.c.Injector [main] Injector: Total urls 
rejected by filters: 0
2023-09-30 07:41:13,625 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected after normalization and filtering: 3
2023-09-30 07:41:13,626 INFO o.a.n.c.Injector [main] Injector: Total urls 
injected but already in CrawlDb: 2
2023-09-30 07:41:13,626 INFO o.a.n.c.Injector [main] Injector: Total new urls 
injected: 1
{noformat}

This is because the urls_merged counter counts unique items, while 
urls_injected does not.

Adding a counter for the number of unique injected URLs will make it possible 
to report the correct count of newly injected URLs.


