Hi all,

Thanks for the replies.

I'm using hadoop-0.18.3. 

We are indexing the ClueWeb09 dataset in much the same way Nutch does.
Reduce builds a document (similar to a Lucene document, but implementing
Writable) by adding the fields generated by Map, and puts it into the
Reduce output collector. The output format, set via setOutputFormat(),
is Nutch's org.apache.nutch.indexer.IndexerOutputFormat, which in turn
uses org.apache.nutch.indexer.lucene.LuceneWriter to index each document
output by Reduce on the local file system and then move the finished
index to HDFS.

I can't work out where to call Reporter.incrCounter() to reset the
counter, because the indexing is done at the end of Reduce.


REGARDING HADOOP MAILING LIST: Whenever I try to subscribe to the Hadoop
mailing list, I get an error saying:

==
Hi. This is the qmail-send program at apache.org .
I'm afraid I wasn't able to deliver your message to the following
addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

<[email protected]>:
This mailing list has moved to common-user at hadoop.apache.org .
==

The strange thing is that I can send mail to the list but don't receive
any mail from it. I didn't even receive the replies to these mails; I
found them on Google! What could be the problem?


Thanks in advance,
Prashant Ullegaddi,
Search and Information Extraction Lab,
IIIT-Hyderabad,
INDIA.



