Hi,
Since we know that our httpclient plugin has some problems, maybe it makes
sense to update to the new library.
I guess this is some work, but maybe someone is interested in taking the
job. :)
http://www.theserverside.com/news/thread.tss?thread_id=38189
HttpClient 3.0 provides the following
Stefan Groschupf wrote:
Hi,
Since we know that our httpclient plugin has some problems, maybe it makes
sense to update to the new library.
I guess this is some work, but maybe someone is interested in taking the
job. :)
I'll take it, thanks for the heads-up.
--
Best regards,
Andrzej Bialecki
Hi,
This is what I did to make NutchConf behave less statically,
without patching any of those 195 places Stefan mentioned.
NutchConf.get() yields the current config.
OpenConf sets a new current config.
Finally, CloseConf closes this config.
But be warned about issues with the plugin cache
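
One way to picture the get/open/close pattern described above is the rough sketch below. It is not the actual patch: the class name ConfSketch, the per-thread stack, and the method names openConf/closeConf are all assumptions made for illustration. The message only says that get() yields the current config, that opening sets a new current one, and that closing ends it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for NutchConf; NOT the real Nutch class. */
public class ConfSketch {
    private final Map<String, String> props = new HashMap<>();

    // Assumption: "current" is tracked per thread as a stack of configs,
    // with a default config always sitting at the bottom.
    private static final ThreadLocal<Deque<ConfSketch>> STACK =
        ThreadLocal.withInitial(() -> {
            Deque<ConfSketch> d = new ArrayDeque<>();
            d.push(new ConfSketch());
            return d;
        });

    /** Yields the current config, so existing static callers need no change. */
    public static ConfSketch get() {
        return STACK.get().peek();
    }

    /** Makes a fresh config the current one and returns it. */
    public static ConfSketch openConf() {
        ConfSketch c = new ConfSketch();
        STACK.get().push(c);
        return c;
    }

    /** Closes the current config and restores the previous one. */
    public static void closeConf() {
        Deque<ConfSketch> d = STACK.get();
        if (d.size() > 1) { // never pop the default config
            d.pop();
        }
    }

    public void set(String key, String value) {
        props.put(key, value);
    }

    public String get(String key, String defaultValue) {
        return props.getOrDefault(key, defaultValue);
    }
}
```

With a layout like this, code that calls the static get() keeps working unchanged, while anything that caches state derived from one config (a plugin cache, for example) could still see stale values after a switch.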
[ http://issues.apache.org/jira/browse/NUTCH-148?page=comments#action_12361128 ]
Piotr Kosiorowski commented on NUTCH-148:
-
Do you have Cygwin installed?
Is 'df' working in your cygwin installation?
Do you run crawl from cygwin shell?
Nutch
outlinks not shown properly in cached.jsp
-
Key: NUTCH-149
URL: http://issues.apache.org/jira/browse/NUTCH-149
Project: Nutch
Type: Bug
Components: searcher, web gui
Versions: 0.8-dev
Environment: windows xp
[ http://issues.apache.org/jira/browse/NUTCH-149?page=comments#action_12361130 ]
raghavendra prabhu commented on NUTCH-149:
--
Do the outlinks work only when the HTML has a base tag,
so that the entire link can be constructed?
If not, will the base
[ http://issues.apache.org/jira/browse/NUTCH-61?page=comments#action_12361131 ]
raghavendra prabhu commented on NUTCH-61:
-
Will the same thing work for a file system?
For a file system, we can directly get the modified date and store it in the db.
The
[ http://issues.apache.org/jira/browse/NUTCH-61?page=comments#action_12361133 ]
Andrzej Bialecki commented on NUTCH-61:
This patch already supports this. Anyway, it needs to be significantly
re-worked to fit into the current development version.
Hi all,
It's time to do some cleanup of the trunk/ after the mapred merge. I'm
planning to remove the old classes in trunk/, from the following packages:
* org.apache.nutch.db.* - all classes
* org.apache.nutch.fetcher.*
* org.apache.nutch.indexer.IndexSegment
*
[ http://issues.apache.org/jira/browse/NUTCH-150?page=all ]
Paul Baclace updated NUTCH-150:
---
Attachment: OutlinkExtractor.java.patch
This patch has 3 changes:
1. Adds a comment that non-plain-text can be a problem.
2. Adds quantifiers to the regular