On Oct 29, 2007, at 2:17 PM, Jim the Standing Bear wrote:

If jobs cannot be recursively launched,
does it mean that I have to pre-process all the catalogs, put
only the filenames into a sequence file, and then use Hadoop from that
point on?

As Lohit wrote, launching recursive jobs that block the current task doesn't work. A much better approach is to have a map/reduce job that generates the work list for the next iteration, and to continue until there are no new items to process.
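A minimal sketch of that driver pattern (plain Python, not the Hadoop API; `process_work_list` is a hypothetical stand-in for one MapReduce pass over the current work list):

```python
def process_work_list(work_list):
    """Stand-in for one MapReduce pass: expand each item into sub-items.

    In a real job this would scan the catalogs named in work_list and
    emit the newly discovered filenames; here a toy tree is hard-coded.
    """
    children = {"root": ["a", "b"], "a": ["a1"], "b": [], "a1": []}
    next_items = []
    for item in work_list:
        next_items.extend(children.get(item, []))
    return next_items

def run_until_done(initial_work):
    """Driver loop: launch one job per round until no new items remain.

    No task ever launches (or blocks on) a child job; the driver runs
    the jobs sequentially, feeding each job's output to the next.
    """
    work = list(initial_work)
    seen = []
    while work:
        seen.extend(work)
        work = process_work_list(work)  # one MapReduce iteration
    return seen

print(run_until_done(["root"]))  # -> ['root', 'a', 'b', 'a1']
```

The key point is that the iteration lives in the driver, not inside a task, so no map/reduce slot sits blocked waiting on a child job.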

-- Owen
