What is the file you attached? It is not safe. I don't know the format of a Lucene index; would you please give an example?
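Independent of the index format, one way to avoid one task per document is to partition the document IDs into N groups and build one split per group. Below is a minimal plain-Java sketch of the partitioning step only. It makes no Hadoop or Lucene calls, the class and method names (DocIdGrouper, group) are hypothetical, and it assumes the document IDs have already been collected into a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop or Lucene.
class DocIdGrouper {

    // Partition docIds into at most numSplits contiguous groups of
    // near-equal size. The IDs do not need to be sequential.
    static List<List<Integer>> group(List<Integer> docIds, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        List<List<Integer>> groups = new ArrayList<List<Integer>>();
        int n = docIds.size();
        int base = n / numSplits;   // minimum size of each group
        int extra = n % numSplits;  // the first 'extra' groups get one more ID
        int start = 0;
        for (int i = 0; i < numSplits && start < n; i++) {
            int size = base + (i < extra ? 1 : 0);
            groups.add(new ArrayList<Integer>(docIds.subList(start, start + size)));
            start += size;
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up, non-sequential document IDs for illustration.
        List<Integer> docIds = Arrays.asList(3, 7, 8, 15, 21, 42, 57, 60, 99, 101);
        for (List<Integer> g : group(docIds, 4)) {
            System.out.println(g);
        }
    }
}
```

Each group would then be carried by a custom InputSplit that serializes its list of IDs, and getSplits would call splits.add once per group rather than once per document. That wiring is not shown here and depends on which hadoop-0.20 API (mapred or mapreduce) the job uses.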
On Sat, Dec 25, 2010 at 12:34 AM, Black, Michael (IS) < [email protected]> wrote:

> Using hadoop-0.20
>
> I'm doing custom input splits from a Lucene index.
>
> I want to split the document IDs across N mappers (I'm testing the
> scalability of the problem across 4 nodes and 8 cores).
>
> So the key is the document# and they are not sequential.
>
> At this point I'm using splits.add to add each document...but that sets up
> one task for every document...not something I want to do of course.
>
> How can I add a group of documents to each split? I found a scant reference
> to PrimeInputSplit but that doesn't seem to resolve on hadoop-0.20.
>
> Michael D. Black
> Senior Scientist
> Northrop Grumman Information Systems
> Advanced Analytics Directorate
