Re: Custom input split

蔡超 Sat, 25 Dec 2010 08:32:58 -0800

What is the file you have attached? It is not safe.

I don't know the format of a Lucene index. Would you please give an example?



On Sat, Dec 25, 2010 at 12:34 AM, Black, Michael (IS) <
[email protected]> wrote:

> Using hadoop-0.20
>
>
> I'm doing custom input splits from a Lucene index.
>
> I want to split the document IDs across N mappers (I'm testing the
> scalability of the problem across 4 nodes and 8 cores).
>
> So the key is the document number, and the keys are not sequential.
>
> At this point I'm using splits.add to add each document...but that sets up
> one task for every document...not something I want to do of course.
>
> How can I add a group of documents to each split?  I found a scant
> reference to PrimeInputSplit but that doesn't seem to resolve on
> hadoop-0.20.
>
>
> Michael D. Black
> Senior Scientist
> Northrop Grumman Information Systems
> Advanced Analytics Directorate
>
>
>
>
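
As a rough illustration of the grouping asked about above (several document
IDs per split instead of one split per document), here is a minimal sketch in
plain Java. It is not taken from the thread and does not use any Hadoop
classes: DocIdGrouper and group() are hypothetical names, and the sketch
assumes the non-sequential document IDs have already been collected into a
list. In a real job, each returned group would presumably be wrapped in a
custom InputSplit subclass (also hypothetical here) and added to the list
returned by getSplits(), so that one map task handles one group.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop or Lucene): partitions a list of
// document IDs into at most numSplits contiguous groups of near-equal size.
final class DocIdGrouper {

    static List<List<Integer>> group(List<Integer> docIds, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        List<List<Integer>> groups = new ArrayList<>();
        int total = docIds.size();
        int base = total / numSplits;   // minimum number of IDs per group
        int extra = total % numSplits;  // the first `extra` groups get one more ID
        int start = 0;
        // Stop early when there are fewer IDs than requested splits,
        // so that no empty group is produced.
        for (int i = 0; i < numSplits && start < total; i++) {
            int size = base + (i < extra ? 1 : 0);
            // Copy the sublist so each group is independent of the input list.
            groups.add(new ArrayList<>(docIds.subList(start, start + size)));
            start += size;
        }
        return groups;
    }
}
```

With seven document IDs and numSplits = 3, this yields groups of 3, 2 and 2
IDs. Choosing numSplits equal to the number of map slots (8 in the setup
described above) would give one group per core, assuming per-document work is
roughly uniform.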
