This is a Hadoop benchmark suite. You can decide which benchmarks match your needs.
https://github.com/intel-hadoop/hibench

(Haven't used it yet!)

----- Original Message -----
| From: "Brian Bockelman" <bbock...@cse.unl.edu>
| To: common-user@hadoop.apache.org
| Sent: Tuesday, October 23, 2012 4:40:04 PM
| Subject: Re: measuring iops
|
| Hi Rita,
|
| I get a bit grumpy when I see IOPS as the primary metric with respect
| to HDFS.
|
| Why? While IOPS are actually a relevant part of the system, many use
| cases of HDFS are for a *throughput oriented* workflow. So, in the
| traditional M/R use cases for HDFS, you likely will barely scratch
| the IOPS the system provides.
|
| In fact, HDFS in 0.20 will create a separate TCP connection for each
| IOPS - that should tell you how low random-access workflows ranked
| on the HDFS designs.
|
| As a disclaimer, there are use cases (particularly HBase, and how I
| currently use our HDFS install!) where IOPS are quite relevant.
| Just recall that they are not the end-all, be-all for HDFS
| performance measurement. It's not the primary number I would look
| for! Each install will have their own requirements.
|
| Brian
|
| On Oct 23, 2012, at 6:01 PM, Rita <rmorgan...@gmail.com> wrote:
|
| > I was curious because when a vendor (big storage company) presented
| > they were offering a hadoop solution. They posted IOPS and I wasn't
| > sure how they were determining this number....
| >
| > On Tue, Oct 23, 2012 at 9:19 AM, Michael Segel
| > <michael_se...@hotmail.com> wrote:
| >
| >> You have two issues.
| >>
| >> 1) You need to know the throughput in terms of data transfer
| >> between disks and controller cards on the node.
| >>
| >> 2) The actual network throughput of having all of the nodes
| >> talking to one another as fast as they can. This will let you see
| >> your real limitations in the ToR Switch's fabric.
| >>
| >> Not sure why you really want to do this except to test the disk,
| >> disk controller, and then networking infrastructure of your ToR
| >> and then your backplane to connect multiple racks....
| >>
| >> HTH
| >>
| >> -Mike
| >>
| >> On Oct 23, 2012, at 7:47 AM, Ravi Prakash <ravi...@ymail.com>
| >> wrote:
| >>
| >>> Do you mean in a cluster being used by users, or as a benchmark
| >>> to measure the maximum?
| >>>
| >>> The JMX page <nn:port>/jmx provides some interesting stats, but
| >>> I'm not sure they have what you want. And I'm unaware of other
| >>> tools which could.
| >>>
| >>> ________________________________
| >>> From: Rita <rmorgan...@gmail.com>
| >>> To: common-user@hadoop.apache.org; Ravi Prakash
| >>> <ravi...@ymail.com>
| >>> Sent: Monday, October 22, 2012 6:46 PM
| >>> Subject: Re: measuring iops
| >>>
| >>> Is it possible to know how many reads and writes are occurring
| >>> thru the entire cluster in a consolidated manner -- this does not
| >>> include replication factors.
| >>>
| >>> On Mon, Oct 22, 2012 at 10:28 AM, Ravi Prakash
| >>> <ravi...@ymail.com> wrote:
| >>>
| >>>> Hi Rita,
| >>>>
| >>>> SliveTest can help you measure the number of reads / writes /
| >>>> deletes / ls / appends per second your NameNode can handle.
| >>>>
| >>>> DFSIO can be used to help you measure the amount of throughput.
| >>>>
| >>>> Both these tests are actually very flexible and have a plethora
| >>>> of options to help you test different facets of performance. In
| >>>> my experience, you actually have to be very careful and
| >>>> understand what the tests are doing for the results to be
| >>>> sensible.
| >>>>
| >>>> HTH
| >>>> Ravi
| >>>>
| >>>> ________________________________
| >>>> From: Rita <rmorgan...@gmail.com>
| >>>> To: "<common-user@hadoop.apache.org>"
| >>>> <common-user@hadoop.apache.org>
| >>>> Sent: Monday, October 22, 2012 7:23 AM
| >>>> Subject: Re: measuring iops
| >>>>
| >>>> Anyone?
| >>>>
| >>>> On Sun, Oct 21, 2012 at 8:30 AM, Rita <rmorgan...@gmail.com>
| >>>> wrote:
| >>>>
| >>>>> Hi,
| >>>>>
| >>>>> Was curious if there was a method to measure the total number
| >>>>> of IOPS (I/O operations per second) on a HDFS cluster.
| >>>>>
| >>>>> --
| >>>>> --- Get your facts first, then you can distort them as you
| >>>>> please.--
| >>>>
| >>>> --
| >>>> --- Get your facts first, then you can distort them as you
| >>>> please.--
| >>>
| >>> --
| >>> --- Get your facts first, then you can distort them as you
| >>> please.--
| >
| > --
| > --- Get your facts first, then you can distort them as you
| > please.--
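
For anyone who wants to try the <nn:port>/jmx route mentioned in the thread, here is a rough sketch of how an operations-per-second figure could be derived from two samples of that page. It assumes the page returns JSON with a top-level "beans" list, and the counter names "ReadOps" and "WriteOps" are made up for illustration; substitute whatever counters your daemons actually expose. It is not a tested recipe against any particular Hadoop version, and for a cluster-wide number you would have to sum the samples from every node yourself.

```python
import json


def sum_counters(jmx_json: str, counter_names: list[str]) -> int:
    """Sum the named numeric counters across every bean in a /jmx JSON dump."""
    beans = json.loads(jmx_json).get("beans", [])
    total = 0
    for bean in beans:
        for name in counter_names:
            value = bean.get(name)
            # Skip beans that lack the counter or hold a non-numeric value.
            if isinstance(value, (int, float)):
                total += value
    return total


def ops_per_second(before: int, after: int, elapsed_s: float) -> float:
    """Rate of operations between two cumulative counter samples."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (after - before) / elapsed_s


# Illustrative samples only: two hand-written dumps taken 60 seconds apart.
# "ReadOps" and "WriteOps" are hypothetical counter names.
sample_t0 = '{"beans": [{"ReadOps": 100, "WriteOps": 50}, {"ReadOps": 10}]}'
sample_t1 = '{"beans": [{"ReadOps": 300, "WriteOps": 120}, {"ReadOps": 40}]}'

names = ["ReadOps", "WriteOps"]
rate = ops_per_second(
    sum_counters(sample_t0, names), sum_counters(sample_t1, names), 60.0
)
print(rate)  # 5.0
```

Because the counters are cumulative, the difference between two samples divided by the interval gives an average rate over that interval, not a peak. As Brian notes above, that number says little about throughput unless you also know the size of each operation.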