On Oct 29, 2007, at 2:17 PM, Jim the Standing Bear wrote:
If jobs cannot be recursively launched, does it mean that I have to pre-process all the catalogs, put only filenames into a sequence file, and then use Hadoop from that point on?
As Lohit wrote, launching recursive jobs that block the current task doesn't work. A much better approach is to have a map/reduce job that generates the work list for the next iteration, and to keep iterating until there are no new items to process.
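Something like the driver loop below, as a minimal sketch of that pattern. It uses the org.apache.hadoop.mapreduce API; CatalogMapper, CatalogReducer, and the NEW_ITEMS counter are illustrative names I made up, not anything shipped with Hadoop. Each iteration reads the work list the previous iteration wrote, and the reducer bumps a counter for every new item it emits, so the driver knows when to stop:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IterativeDriver {

    // Hypothetical counter: incremented once per newly discovered
    // work item (e.g. a sub-catalog that still needs expanding).
    public enum WorkCounter { NEW_ITEMS }

    // Hypothetical mapper: would fetch/parse one catalog per input
    // line and emit any sub-catalogs it discovers as next-round work
    // items. The real expansion logic is elided; as written it emits
    // nothing, so the driver loop stops after the first pass.
    public static class CatalogMapper
            extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx) {
            // real catalog-expansion logic goes here
        }
    }

    public static class CatalogReducer
            extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context ctx)
                throws IOException, InterruptedException {
            // One output line per distinct new item; count it so the
            // driver knows another iteration is needed.
            ctx.getCounter(WorkCounter.NEW_ITEMS).increment(1);
            ctx.write(key, new Text(""));
        }
    }

    public static void main(String[] args) throws Exception {
        Path input = new Path(args[0]);    // initial work list
        String outputBase = args[1];
        int iteration = 0;
        long newItems;
        do {
            Path output = new Path(outputBase + "/iter-" + iteration);

            Job job = Job.getInstance(new Configuration(),
                                      "catalog-walk-" + iteration);
            job.setJarByClass(IterativeDriver.class);
            job.setMapperClass(CatalogMapper.class);
            job.setReducerClass(CatalogReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.setInputPaths(job, input);
            FileOutputFormat.setOutputPath(job, output);

            if (!job.waitForCompletion(true)) {
                System.exit(1);            // this iteration failed
            }
            newItems = job.getCounters()
                          .findCounter(WorkCounter.NEW_ITEMS).getValue();

            input = output;   // this round's output is next round's input
            iteration++;
        } while (newItems > 0);  // done when no new work was generated
    }
}

The point is that each job runs to completion and only then does the driver decide, from the counter, whether to launch another one; no task ever blocks waiting on a job it spawned.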
-- Owen
