My 10x figure was very rough.

I based it on:

a) you want a few files per map task
b) you want a map task per core

I tend to use quad core machines, so I used 2 x 4 = 8 files per machine (roughly 10).

On EC2, you don't have multi-core machines (I think), so you might be fine 
with 2-4 files per CPU.


-----Original Message-----
From: C G [mailto:[EMAIL PROTECTED]]
Sent: Fri 8/31/2007 11:21 AM
To: hadoop-user@lucene.apache.org
Subject: RE: Compression using Hadoop...
 
> Ted, from what you are saying I should be using at least 80 files given the 
> cluster size, and I should modify the loader to be aware 
> of the number of nodes and split accordingly. Do you concur?
