be specified in arguments.
-Original Message-
From: oddaniel [mailto:[EMAIL PROTECTED]
Sent: May 5, 2008 13:27
To: nutch-user@lucene.apache.org
Subject: Someone Please respond ... Deleting Urls already crawled from the
crawlDB
Guys, I have been trying to get this done for weeks now. No progress. Someone
please help me. I am trying to delete a domain already crawled from my
crawldb and index.
I have a list of domains already crawled in my index. How do I exclude or
delete domains from my crawl output folder? I have
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: oddaniel [EMAIL PROTECTED]
To: nutch-user@lucene.apache.org
Sent: Saturday, April 19, 2008 4:20:04 AM
Subject: Delete Urls from CrawlsDB
Is it possible to remove or delete one of the urls that has
Is it possible to index images with Nutch? If so, how can this be done? Any
article or sample code will be very helpful. Thanks.
A nudge in the right direction will be ok. Thanks.
--
View this message in context:
http://www.nabble.com/Searching-For-Images-tp16807326p16807326.html
Sent from the
Is it possible to remove or delete one of the urls that has been crawled from
the crawl database? If this is possible, how can it be done?
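
For readers landing on this question: the approach usually suggested is to add
an exclusion rule to the URL filter configuration (conf/regex-urlfilter.txt)
and then rewrite the crawldb with filtering switched on, for example through
the -filter option of bin/nutch mergedb. Check the usage output of your Nutch
version before relying on that option. The sketch below does not call Nutch at
all. It only illustrates, in plain Java, how rules in that file format are
understood to be evaluated: each rule is "+" or "-" followed by a regex, the
first rule whose regex is found in the URL decides, and a URL that matches no
rule is rejected. The class name UrlFilterSketch and the example domains are
made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Standalone illustration of "+regex" / "-regex" URL filter rules (not Nutch code). */
public class UrlFilterSketch {

    /** One rule: accept or reject URLs in which the pattern is found. */
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Parses rule lines; blank lines and lines starting with '#' are skipped. */
    public UrlFilterSketch(List<String> lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            char sign = trimmed.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + trimmed);
            }
            rules.add(new Rule(sign == '+', Pattern.compile(trimmed.substring(1))));
        }
    }

    /** First matching rule wins; a URL that matches no rule is rejected. */
    public boolean accepts(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical rules: drop one domain (and its subdomains), keep everything else.
        UrlFilterSketch filter = new UrlFilterSketch(Arrays.asList(
                "# exclude the unwanted domain",
                "-^https?://([a-z0-9-]+\\.)*unwanted\\.example/",
                "+."));
        System.out.println(filter.accepts("http://www.unwanted.example/page.html"));
        System.out.println(filter.accepts("http://keep.example/index.html"));
    }
}
```

The exclusion rule has to come before the catch-all "+." line, because
evaluation stops at the first match. Pages already in the index are a separate
matter and would still need to be removed or re-indexed.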
--
View this message in context:
http://www.nabble.com/Delete-Urls-from-CrawlsDB-tp16773512p16773512.html
Sent from the Nutch - User mailing list archive at
Hi, please how can I run a Nutch search that returns PDF documents only?
Thanks.
Daniel
--
View this message in context:
http://www.nabble.com/Search-for-Just-PDF-documents-tp16721681p16721681.html
Sent from the Nutch - User mailing list archive at Nabble.com.
oddaniel wrote:
I am trying to merge two crawl results.
1. Merge linkdbs - WORKS FINE.
2. Merge crawldbs - WORKS FINE.
For some reason I keep getting java.io.IOException: No input paths
specified in input when trying to Merge Segments. Can anyone please tell
me what could be causing
Please, how can I merge two different crawls?
Can I do this from within my Java class? And how, please?
I have searched through the forum and all I see is scripts on how to do
this. I don't have a clue how to get this done from within the actual Java
code. A nudge in the right direction would be
I am trying to merge two crawl results.
1. Merge linkdbs - WORKS FINE.
2. Merge crawldbs - WORKS FINE.
For some reason I keep getting java.io.IOException: No input paths specified
in input when trying to Merge Segments. Can anyone please tell me what
could be causing this and how to fix it?
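
One possible cause, offered as an assumption since the exact merge call is not
shown: the segment merger is given a path that is not itself a segment (for
example the parent "segments" directory passed as if it were one segment), so
it finds none of the expected subdirectories and submits a job with no input.
A segment directory normally contains some of content, crawl_generate,
crawl_fetch, crawl_parse, parse_data and parse_text. The sketch below is plain
Java on the local filesystem and does not call Nutch or Hadoop; the class name
SegmentDirCheck is made up. It lists the children of a segments directory that
look like real segments, so those paths can be checked before being handed to
the merge step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Local-filesystem sanity check for Nutch segment directories (not Nutch code). */
public class SegmentDirCheck {

    // Subdirectories a fetched and parsed segment normally contains.
    static final String[] PARTS = {
        "content", "crawl_generate", "crawl_fetch",
        "crawl_parse", "parse_data", "parse_text"
    };

    /** True if dir is a directory holding at least one expected segment part. */
    static boolean looksLikeSegment(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        for (String part : PARTS) {
            if (Files.isDirectory(dir.resolve(part))) {
                return true;
            }
        }
        return false;
    }

    /** Returns the direct children of segmentsDir that look like segments, sorted by path. */
    static List<Path> findSegments(Path segmentsDir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (Stream<Path> children = Files.list(segmentsDir)) {
            children.filter(SegmentDirCheck::looksLikeSegment)
                    .sorted()
                    .forEach(result::add);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default location; pass the real segments directory as an argument.
        Path segmentsDir = Paths.get(args.length > 0 ? args[0] : "crawl/segments");
        List<Path> segments = findSegments(segmentsDir);
        if (segments.isEmpty()) {
            System.out.println("No usable segments under " + segmentsDir);
        }
        for (Path segment : segments) {
            System.out.println(segment);
        }
    }
}
```

If the list comes back empty, or the path being passed is the parent directory
rather than the entries it prints, that would be consistent with the error
above. For crawl data stored on HDFS the same check would have to go through
the Hadoop FileSystem API instead of java.nio.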