Which file system are you using? We've observed that ext3, the default on EC2 ephemeral disks, scales very poorly under multicore workloads. We recommend reformatting those disks as XFS (which is very fast to format) or ext4 (which unfortunately takes a few hours to finalize). There may also be other file system options that affect SSDs.
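To check which file system a given directory sits on, something like the following minimal Python sketch could help. It assumes a Linux host where /proc/mounts is readable. The helper name fs_type_for and the path /mnt/spark are only illustrative; substitute whatever directory your spark.local.dir points at.

```python
import os


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts (one mount per line:
    device, mount point, fs type, options, ...). The longest matching
    mount point wins; among equal matches the later line wins.
    Note: mount points containing spaces are escaped in /proc/mounts
    (e.g. \\040) and are not decoded by this sketch.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        mounts = f.read()
    # /mnt/spark is an assumed location for Spark's local (shuffle) directory.
    print(fs_type_for(os.path.realpath("/mnt/spark"), mounts))
```

If this reports ext3 for the directory Spark spills shuffle data to, that disk is a candidate for the reformatting suggested above.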
Matei

On Jan 17, 2014, at 8:01 AM, Andrew Ash <and...@andrewash.com> wrote:

> Are there different amounts of RAM on the SSD machines vs the Spinny disk
> machines?
>
> Sent from my mobile phone
> On Jan 17, 2014 5:22 AM, "Jay" <hja...@gmail.com> wrote:
>
>> OS memory cache??
>>
>> Sent from my iPad.
>>
>>> On Jan 16, 2014, at 6:04 AM, Chen Jin <karen...@gmail.com> wrote:
>>>
>>> Dear Spark developers:
>>>
>>> We are benchmarking Spark operations such as filter, group, join on
>>> SSD instance i2.2xlarge on EC2. Most operations are similar or
>>> slightly better than ephemeral disks on EC2; however, the performance
>>> of the group operation on SSD is much worse than on regular disks, at
>>> least 2x to 3x worse. Could any of you shed some light on this behavior?
>>>
>>> Thanks a lot,
>>>
>>> -chen
>>