Hi,
I'm glad that I've been selected for GSoC 2016 for my proposal: Security
Layer for NutchServer (NUTCH-1756).
I'm planning to implement all new features for Nutch 2.x. I have some
questions and ideas for that. Firstly, I'm planning to link all these
issues:
[
https://issues.apache.org/jira/browse/NUTCH-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2254:
--
Assignee: Sebastian Nagel
> Charset issues when using -addBinaryContent and -base64
[
https://issues.apache.org/jira/browse/NUTCH-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256331#comment-15256331
]
ASF GitHub Bot commented on NUTCH-2254:
---
GitHub user sebastian-nagel opened a pull request:
https://github.com/apache/nutch/pull/107
NUTCH-2254 Indexer: character set issue with -addBinaryContent and -base64
- generate base64 encoded string directly from content bytes
(based on patch provided by Federico Bonelli)
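The fix described above can be illustrated with a minimal sketch. It assumes the charset problem came from passing binary content through a String before base64 encoding; that is inferred from the issue title, not confirmed here. The class and method names are hypothetical, and java.util.Base64 is used only for illustration; it is not necessarily the encoder Nutch uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch, not the actual Nutch indexer code.
public class Base64ContentSketch {

    // Encode the raw content bytes directly; no charset is involved.
    static String encodeDirect(byte[] content) {
        return Base64.getEncoder().encodeToString(content);
    }

    // Lossy variant for comparison: the bytes are first decoded as UTF-8 text,
    // so invalid sequences are replaced before base64 encoding happens.
    static String encodeViaString(byte[] content) {
        String text = new String(content, StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // First four bytes of a typical JPEG file; not valid UTF-8.
        byte[] binary = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        byte[] direct = Base64.getDecoder().decode(encodeDirect(binary));
        System.out.println(Arrays.equals(binary, direct)); // prints "true"

        byte[] lossy = Base64.getDecoder().decode(encodeViaString(binary));
        System.out.println(Arrays.equals(binary, lossy)); // prints "false"
    }
}
```

Encoding directly from the bytes round-trips binary content exactly, while the String detour alters any byte sequence that is not valid in the chosen charset.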
GitHub user thammegowda opened a pull request:
https://github.com/apache/nutch/pull/106
Option to include inlinks in commonscrawl dump
This PR enhances the CommonCrawlDumper with an optional CLI argument to
accept linkdb path.
When this option is supplied, an additional field
[
https://issues.apache.org/jira/browse/NUTCH-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256225#comment-15256225
]
Sebastian Nagel commented on NUTCH-2254:
Hi [~fedechicco], the patch should work. Thanks!
I'll add