kenneth mcfarland created NUTCH-2406:
Summary: Sum up constants, make minor changes
Key: NUTCH-2406
URL: https://issues.apache.org/jira/browse/NUTCH-2406
Project: Nutch
Issue Type: Improv
[ https://issues.apache.org/jira/browse/NUTCH-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118039#comment-16118039 ]
ASF GitHub Bot commented on NUTCH-2406:
---
kpm1985 opened a new pull request #210: NUT
Hey,
we are currently on Nutch 2.3.1 and use it to crawl our websites.
One of our main goals is to get all the PDFs on our website crawled. Links on
the different websites look like: https://assets0.mysite.com/asset /DB_product.pdf
I tried different things:
At the configurations I removed ever occur