Andrew McNabb wrote:
I'm looking at the Reporter interface, and I would like to verify my
understanding of what it is.  It appears to me that Reporter.setStatus()
is called periodically during an operation to give a human-readable
description of how far the operation has progressed.  Is that correct?

Yes.  These strings appear in the web interface and in logs.

Reporter also has another function: it tells the MapReduce system that the task is not hung and that progress is still being made. If an individual operation (map, reduce, close) may take longer than the task timeout (10 minutes by default?), then setStatus() should be called during that operation, or the task will be assumed to be hung and will be killed.
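
As a rough illustration of the point above, a long-running step can report status at intervals so the system keeps seeing progress. This is only a sketch: the Reporter interface here is a local stand-in whose shape is assumed (a single setStatus(String) method), and copyItems, its parameters, and the status text are hypothetical names invented for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ProgressSketch {
    /** Local stand-in for the Reporter interface; the real declaration may differ. */
    public interface Reporter {
        void setStatus(String status) throws IOException;
    }

    /**
     * Hypothetical long-running step. It reports after every `interval`
     * items and once more at the end, so a long operation never goes
     * silent for the whole of its run.
     */
    public static void copyItems(int total, int interval, Reporter reporter)
            throws IOException {
        for (int i = 1; i <= total; i++) {
            // ... the per-item work would go here ...
            if (i % interval == 0 || i == total) {
                reporter.setStatus("copied " + i + " of " + total);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Collect the status strings instead of sending them to a real system.
        List<String> seen = new ArrayList<>();
        copyItems(10, 4, seen::add);
        System.out.println(seen);
    }
}
```

How often to report is a judgment call; the interval only needs to be comfortably shorter than the task timeout.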

If so, is there a reason that RecordWriter.close() requires a Reporter
(are there situations where it takes a long time)?

Some reduce processes (e.g., Lucene indexing) write to temporary local files and then copy their final output to NDFS on close.

Also, is there a
standard "NullReporter" class for situations where updating is not
needed?

A NullReporter would be easy to define, but I'm not sure why you ask, since Reporters are not usually created by user code but rather by the MapReduce system.
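
For what it's worth, a minimal sketch of such a class might look like the following. No NullReporter is known to exist in the codebase; the class is hypothetical, and the Reporter interface shown is a local stand-in with an assumed single setStatus(String) method.

```java
import java.io.IOException;

public class NullReporterSketch {
    /** Local stand-in for the Reporter interface; the real declaration may differ. */
    public interface Reporter {
        void setStatus(String status) throws IOException;
    }

    /** Hypothetical no-op Reporter: accepts status updates and discards them. */
    public static final class NullReporter implements Reporter {
        @Override
        public void setStatus(String status) {
            // Intentionally empty: nothing is logged or forwarded.
        }
    }

    public static void main(String[] args) throws IOException {
        Reporter reporter = new NullReporter();
        reporter.setStatus("this status is ignored");
        System.out.println("done");
    }
}
```

Note that a no-op reporter also discards the "still alive" signal, so it would only suit code running outside the MapReduce system, for example a direct call to close() in a test.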

Doug


_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
