[
https://issues.apache.org/jira/browse/NUTCH-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566950#action_12566950
]
Emmanuel Joke commented on NUTCH-567:
-
Hi Dogacan, do you think you will commit this new
Update build.xml to include tika jar
Key: NUTCH-607
URL: https://issues.apache.org/jira/browse/NUTCH-607
Project: Nutch
Issue Type: Bug
Environment: All
Reporter: Dennis Kubes
[
https://issues.apache.org/jira/browse/NUTCH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dennis Kubes updated NUTCH-607:
---
Attachment: NUTCH-607-1-20080208.patch
Updates build to include the tika.jar for the war. Correct
[
https://issues.apache.org/jira/browse/NUTCH-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dennis Kubes updated NUTCH-606:
---
Attachment: NUTCH-606-1-20080208.patch
Refactors the generator and ensures the checks are run on all
[
https://issues.apache.org/jira/browse/NUTCH-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dennis Kubes reassigned NUTCH-606:
--
Assignee: Dennis Kubes
Refactoring of Generator, run all urls through checks
[
https://issues.apache.org/jira/browse/NUTCH-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dennis Kubes updated NUTCH-606:
---
Attachment: NUTCH-606-2-20080208.patch
Adds some refactoring to close file readers before exiting
Refactoring of Generator, run all urls through checks
-
Key: NUTCH-606
URL: https://issues.apache.org/jira/browse/NUTCH-606
Project: Nutch
Issue Type: Bug
Components: generator
-20080208.patch, NUTCH-606-2-20080208.patch
Refactor the generator to make sure all urls run through checks such as host
and protocol checks, and IP checks if necessary. Currently the generator only
does this for urls if generate.max.per.host > 0, which by default is -1. So
by default all urls
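
To illustrate the behaviour described in this issue, here is a minimal sketch. It is not the Nutch Generator code or the attached patch. The class and method names are hypothetical, and it assumes the current condition is a positive generate.max.per.host value and that a basic host and protocol check can be approximated with java.net.URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch for illustration only; not the actual Nutch Generator.
final class GeneratorCheckSketch {

    // Behaviour as described in the issue: checks are only reached when the
    // per-host limit is positive, so the default of -1 skips them.
    static boolean checksRunCurrently(int maxPerHost) {
        return maxPerHost > 0;
    }

    // Proposed direction: run every url through a host and protocol check,
    // independent of the per-host limit. An empty host is rejected too.
    static boolean passesChecks(String url) {
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            String host = uri.getHost();
            return scheme != null && host != null && !host.isEmpty();
        } catch (URISyntaxException e) {
            // Malformed urls fail the check instead of aborting the run.
            return false;
        }
    }
}
```

With the default limit of -1, checksRunCurrently returns false, which is the gap the issue reports. passesChecks does not look at the limit at all, which is the point of the refactoring as described here.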
Assignee: Dennis Kubes
Fix For: 1.0.0
Attachments: NUTCH-607-1-20080208.patch
Update the build.xml to include the tika jar in the war file. Currently the
jar is not included and the cached.jsp page errors out.
--
This message is automatically generated by JIRA
Upgrade nutch to use released apache-tika-0.1-incubating
Key: NUTCH-608
URL: https://issues.apache.org/jira/browse/NUTCH-608
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dennis Kubes updated NUTCH-606:
---
Attachment: NUTCH-606-3-20080208.patch
Added an empty check for hostnames
Refactoring of Generator