Hi, how do I make Hadoop split its output? The program I am writing crawls a catalog tree starting from a single URL, so initially the input contains only one entry. After a few iterations, it will have tens of thousands of URLs. But what I noticed is that the output always ends up in one file (part-00000). What I would like is for the job to be parallelized once the number of entries increases. Currently that doesn't seem to be the case.
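To make the goal concrete, here is a minimal sketch in plain Java (no Hadoop dependency) of the kind of split I am after: each URL is assigned to one of N parts by its hash, so N workers could each take one part. The class name PartSplitSketch, the method names, the example.com URLs, and the choice of 4 parts are all made up for illustration. The hash arithmetic is the same as in Hadoop's default HashPartitioner, but whether the number of reduce tasks is the right setting for my job is only an assumption on my part.

```java
import java.util.ArrayList;
import java.util.List;

public class PartSplitSketch {
    // Pick a part for a URL: clear the sign bit of the hash, then take the
    // remainder, so the result is always in the range [0, numParts).
    static int partFor(String url, int numParts) {
        return (url.hashCode() & Integer.MAX_VALUE) % numParts;
    }

    // Distribute the URLs over numParts lists, one list per would-be part file.
    static List<List<String>> split(List<String> urls, int numParts) {
        List<List<String>> parts = new ArrayList<>();
        for (int i = 0; i < numParts; i++) {
            parts.add(new ArrayList<>());
        }
        for (String url : urls) {
            parts.get(partFor(url, numParts)).add(url);
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical crawl frontier after a few iterations.
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            urls.add("http://example.com/catalog/" + i);
        }
        // 4 is an arbitrary part count chosen for the illustration.
        List<List<String>> parts = split(urls, 4);
        for (int p = 0; p < parts.size(); p++) {
            System.out.printf("part-%05d: %d urls%n", p, parts.get(p).size());
        }
    }
}
```

With a part count of 1 every URL lands in part 0, which matches what I am seeing now with the single part-00000 file.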
-- -------------------------------------- Standing Bear Has Spoken --------------------------------------