Hi Travis,

Are you using SSDs or spinning disks in your configuration?

Thanks,
Colin Williams

On Mon, Oct 6, 2014 at 3:09 PM, Travis <[email protected]> wrote:

> For filesystem creation, we use the following with mkfs.ext4
>
> mkfs.ext4 -T largefile -m 1 -O dir_index,extent,sparse_super -L $HDFS_LABEL /dev/${DEV}1
>
> By default, mkfs creates way too many inodes, so we tune it a bit with the
> "largefile" option, which modifies the inode_ratio. This gives us ~2
> million usable inodes on a 2TB filesystem.
>
> As well, by default, mkfs sets the block reserve to 5%, which wastes a
> fair amount of space, since this space is only accessible to the root
> user. We tune this down to 1% at mkfs time, but you can use tune2fs at
> runtime to change it.
>
> I don't know that I would use writeback. This mode is problematic in the
> event of a crash because it can allow old data to exist on the FS, but with
> new metadata. I consider this corruption. Unless you know your
> environment to be super stable (meaning no OS or hardware-induced crashes)
> AND you have stable, UPS-backed power, I would steer clear of this.
>
> If you're looking for the utmost in filesystem performance, you're better
> off looking at the controller card you're using. Right now, we're using
> LSI9207-8i and seeing an aggregate 1.6-1.8GBytes/sec throughput across 12
> drives in JBOD. Our older LSI-based cards can only sustain maybe a quarter
> of that in the same disk configuration.
>
> Travis
>
> On Mon, Oct 6, 2014 at 4:46 PM, Colin Kincaid Williams <[email protected]> wrote:
>
>> Hi,
>>
>> I'm trying to figure out what are more ideal settings for using ext4 on
>> hadoop cluster datanodes. From the hadoop site its recommended nodelalloc
>> option is chosen in the fstab. Is that still a preferred option?
>>
>> I read elsewhere to disable the ext4 journal, and use data=writeback.
>>
>> http://fenidik.blogspot.com/2010/03/ext4-disable-journal.html
>>
>> Finally, in some slides i read to use dir_index,sparse_super,extent when
>> creating the filesystem, and mount noatime and nodiratime
>>
>> http://www.slideshare.net/leonsp/best-practices-for-deploying-hadoop-biginsights-in-the-cloud
>>
>
> --
> Travis Campbell
> [email protected]
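
For reference, the inode and reserve figures quoted above can be sanity-checked with plain shell arithmetic. This is only a rough sketch, not output from mkfs: it assumes "-T largefile" maps to inode_ratio = 1048576 and that the default ratio is 16384 (the values in the stock mke2fs.conf; check /etc/mke2fs.conf on your own system), and it treats "2TB" as 2000000000000 bytes. The function names estimate_inodes and reserved_bytes are made up for illustration. Real counts will differ somewhat because mkfs rounds per block group.

```shell
#!/bin/sh
# Rough estimates only; nothing here touches a real device.

# Assumption: "-T largefile" means one inode per 1 MiB of space
# (inode_ratio = 1048576 in the stock mke2fs.conf).
LARGEFILE_INODE_RATIO=1048576

# estimate_inodes SIZE_BYTES [INODE_RATIO]
# Approximate inode count: filesystem size divided by bytes-per-inode.
estimate_inodes() {
    echo $(( $1 / ${2:-$LARGEFILE_INODE_RATIO} ))
}

# reserved_bytes SIZE_BYTES PERCENT
# Space held back for root at the given reserve percentage.
reserved_bytes() {
    echo $(( $1 * $2 / 100 ))
}

SIZE=2000000000000   # a "2TB" drive, in decimal bytes (assumption)

echo "inodes, largefile ratio:   $(estimate_inodes "$SIZE")"
echo "inodes, default ratio:     $(estimate_inodes "$SIZE" 16384)"
echo "reserved at 5% (default):  $(reserved_bytes "$SIZE" 5) bytes"
echo "reserved at 1% (-m 1):     $(reserved_bytes "$SIZE" 1) bytes"

# To change the reserve on an existing filesystem, as mentioned above
# (not executed here; substitute your own device):
#   tune2fs -m 1 /dev/${DEV}1
```

On an existing filesystem, `tune2fs -l` on the device should show the actual "Inode count" and "Reserved block count" to compare against these estimates.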
