Hi everybody,
Can someone shed some light on NUTCH-124?
RobotRulesParser.java doesn't follow redirects when requesting the
robots.txt file. Doug patched this, but the patch didn't make it into
trunk.
What is the desired behavior here?
For example, when requesting the following URL:
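
One possible answer to the question above, sketched in Python rather than Nutch's Java so that it stays short: request robots.txt, and if the server answers with a redirect, follow the Location header up to a fixed limit. This is not Doug's patch for NUTCH-124 and not the RobotRulesParser code. The names `fetch_robots`, `fetch`, `fake_fetch`, and `max_redirects`, the limit of 5, and the example.com hosts are all assumptions made for illustration.

```python
from urllib.parse import urljoin

# HTTP status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_robots(page_url, fetch, max_redirects=5):
    """Return the robots.txt body for the host of page_url, or None.

    `fetch` is a hypothetical callable: fetch(url) -> (status, headers, body).
    None is returned when robots.txt is missing, when a redirect has no
    Location header, on a redirect loop, or when the limit is exceeded.
    """
    url = urljoin(page_url, "/robots.txt")
    seen = set()
    for _ in range(max_redirects + 1):
        if url in seen:  # redirect loop, give up
            return None
        seen.add(url)
        status, headers, body = fetch(url)
        if status in REDIRECT_CODES:
            location = headers.get("Location")
            if not location:
                return None
            url = urljoin(url, location)  # Location may be relative
            continue
        return body if status == 200 else None
    return None


def fake_fetch(url):
    """Stand-in for a real HTTP client so the sketch runs offline."""
    pages = {
        "http://example.com/robots.txt": (
            301, {"Location": "http://www.example.com/robots.txt"}, ""),
        "http://www.example.com/robots.txt": (
            200, {}, "User-agent: *\nDisallow: /private/\n"),
    }
    return pages.get(url, (404, {}, ""))


if __name__ == "__main__":
    print(fetch_robots("http://example.com/some/page.html", fake_fetch))
```

Whether a missing or unreachable robots.txt should mean "allow all" or "deny all" is a separate policy decision that this sketch leaves to the caller.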
[
https://issues.apache.org/jira/browse/NUTCH-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683772#action_12683772
]
Edwin Chu commented on NUTCH-702:
-
I have encountered OutOfMemoryError in CrawlDBReducer
On Thu, Mar 19, 2009 at 23:46, Sami Siren ssi...@gmail.com wrote:
Sami Siren wrote:
Andrzej Bialecki wrote:
How about the following: we build just 2 packages:
* binary: this includes only base hadoop libs in lib/ (enough to start a
local job, no optional filesystems etc), the *.job and
[
https://issues.apache.org/jira/browse/NUTCH-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683813#action_12683813
]
Doğacan Güney commented on NUTCH-728:
-
Is there a particular reason that repository is
[
https://issues.apache.org/jira/browse/NUTCH-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683814#action_12683814
]
Sami Siren commented on NUTCH-728:
--
Not really, it just happens to be the mirror I use.
[
https://issues.apache.org/jira/browse/NUTCH-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683831#action_12683831
]
Doğacan Güney commented on NUTCH-728:
-
OK. I tested it, it works fine.
+1
Hi there,
I want to configure Nutch in Eclipse.
Can you please help me with this? Where can I download the
code, jar files, etc.?
Thanks,
Sherjeel.
Sherjeel Niazi wrote:
Hi there,
I want to configure Nutch in Eclipse.
Can you please help me with this? Where can I download
the code, jar files, etc.?
Thanks,
Sherjeel.
Windows or Linux?
I am working on Windows.
Sherjeel Niazi wrote:
I am working on Windows.
OK, so you have to download:
Cygwin: http://www.cygwin.com/setup.exe
Nutch (from trunk):
http://hudson.zones.apache.org/hudson/job/Nutch-trunk/758/artifact/trunk/build/nutch-2009-03-20_04-01-47.tar.gz
Install Cygwin and set the PATH variable for it.
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The following page has been changed by BartoszGadzimski:
http://wiki.apache.org/nutch/RunNutchInEclipse0%2e9
The comment on the change is:
added description for Windows users
Hi
I have configured my Eclipse project as described here:
http://wiki.apache.org/nutch/RunNutchInEclipse0.9
Still, I am getting the following errors:
- The return type is incompatible with Parser.getParse(Content)
RTFParseFactory.java
Check out my blog:
http://j2eewebsearch.blogspot.com/
Check out the third point...
Let me know if you get it all right. Your comments will be appreciated.
Regards,
Ninad
On Sat, Mar 21, 2009 at 6:32 AM, Rodrigo Reyes C. rre...@corbitecso.com wrote:
Hi
I have configured my eclipse