> From: Bradford Stephens
> I'm banging my head against some perf issues on EC2. I'm
> using .20.6 on ASF hadoop .20.2, and tweaked the ec2 hbase
> scripts to handle the new version.
>
> I'm trying to insert about 22G of data across nodes on EC2
> m1.large instances [...]
c1.xlarge provides (barely) adequate I/O bandwidth.
The periods of higher latency that you mention in the part of your mail that
I clipped are probably due to the hypervisor stealing your resources to attend
to a noisy neighbor with a better reservation class.
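
One way to check whether steal is actually the culprit is to sample the "steal"
counter on the aggregate cpu line of /proc/stat while the load is running. The
sketch below is only an illustration, not something from the HBase EC2 scripts:
the function names and the 5 second interval are my own choices, and it assumes
a Linux guest whose kernel reports steal time at all (not every Xen guest
kernel does).

```python
import time


def parse_cpu_line(line):
    """Return the first eight counters of the aggregate 'cpu' line as ints.

    Field order per proc(5): user nice system idle iowait irq softirq steal.
    The trailing guest fields are skipped because they are already counted
    inside user and nice.
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:9]]


def steal_percent(before, after):
    """Percentage of CPU time stolen between two samples of the counters."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if len(deltas) < 8 or total <= 0:
        # Very old kernels have no steal column; report 0 rather than guess.
        return 0.0
    return 100.0 * deltas[7] / total


def sample_steal(interval=5.0, path="/proc/stat"):
    """Read the counters twice, `interval` seconds apart (hypothetical helper)."""
    def read():
        with open(path) as f:
            return parse_cpu_line(f.readline())

    before = read()
    time.sleep(interval)
    return steal_percent(before, read())


if __name__ == "__main__":
    print("steal over last 5s: %.1f%%" % sample_steal())
```

If you run this on each node during the insert and the number climbs during
the slow periods, that would be consistent with the noisy neighbor
explanation. A reading near zero does not rule out I/O contention, which this
counter does not measure.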
I would not consider EC2 a high performance platform, except perhaps for their
cluster compute nodes, which have been specially engineered for HPC using a
completely different virtualization and network architecture than the rest. EC2
is about bulk processing in a reasonable (subject to definition) timeframe at
cheap/elastic prices.
- Andy