Yes, the only error (a WARN) I get is:
mapred.FileOutputCommitter - Output path is null in cleanup
What does this mean? Also, what would be the command line to index a single
domain, say test.com?
Why does generate give me the same fetch list every time? I thought Nutch
would only re-index the same page once every 30 days.
My setup fetches the same pages every time I index, which seems a waste of
resources.
Cheers
Shane.
On 26/03/14 06:37, d_k wrote:
Are you sure all the steps are working? Did you look at the logs?
On Tue, Mar 25, 2014 at 4:50 AM, Shane Wood<[email protected]> wrote:
I have set up Nutch, Solr and MySQL as per this how-to:
http://nlp.solutions.asia/?p=362
I run Nutch using these commands.
./bin/nutch inject urls
./bin/nutch generate -topN 20
./bin/nutch fetch -all
./bin/nutch parse -all
./bin/nutch updatedb
./bin/nutch solrindex http://127.0.0.1:8983/solr/ -reindex
I have a /crawl folder, yet nothing appears in it while it's indexing. Where
does Nutch store the content etc. while it's indexing?
Is there an informative FAQ on what difference using MySQL makes to your
setup?
Cheers for any help
Shane.