HiBench is a Hadoop benchmark suite. You can pick the benchmarks that match
your needs:

https://github.com/intel-hadoop/hibench

(Haven't used it yet!)
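
On the JMX route Ravi mentions in the thread below: a rough sketch of how two
samples of cumulative counters could be turned into an operations-per-second
figure. This is only an illustration under assumptions. It assumes the
<nn:port>/jmx page returns JSON with a top-level "beans" list, and the counter
names "ReadOps" and "WriteOps" are hypothetical placeholders, not real Hadoop
metric names. Check the JMX output of your own install for the attributes you
actually care about.

```python
import json
from urllib.request import urlopen


def sum_counters(beans, names):
    """Sum the named numeric attributes across a list of JMX bean dicts."""
    total = 0
    for bean in beans:
        for name in names:
            value = bean.get(name)
            # Skip missing or non-numeric attributes (bool is excluded on purpose).
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                total += value
    return total


def fetch_counters(jmx_url, names):
    """Fetch a /jmx page and sum the named counters.

    Assumes the page is JSON shaped like {"beans": [...]}.
    """
    with urlopen(jmx_url) as resp:
        beans = json.load(resp).get("beans", [])
    return sum_counters(beans, names)


def ops_per_second(before, after, elapsed_seconds):
    """Rate between two samples of a cumulative counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (after - before) / elapsed_seconds
```

To use it, call fetch_counters twice with a known interval in between and pass
both totals to ops_per_second. As Brian points out below, a number like this is
not the whole story for a throughput-oriented system.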

----- Original Message -----
| From: "Brian Bockelman" <bbock...@cse.unl.edu>
| To: common-user@hadoop.apache.org
| Sent: Tuesday, October 23, 2012 4:40:04 PM
| Subject: Re: measuring iops
| 
| Hi Rita,
| 
| I get a bit grumpy when I see IOPS as the primary metric with respect
| to HDFS.
| 
| Why?  While IOPS are actually a relevant part of the system, many use
| cases of HDFS are for a *throughput oriented* workflow.  So, in the
| traditional M/R use cases for HDFS, you likely will barely scratch
| the IOPS the system provides.
| 
| In fact, HDFS in 0.20 will create a separate TCP connection for each
| I/O operation - that should tell you how low random-access workflows
| ranked in the HDFS design.
| 
| As a disclaimer, there are use cases (particularly HBase, and how I
| currently use our HDFS install!) where IOPS are quite relevant.
| Just recall that they are not the be-all and end-all of HDFS
| performance measurement.  It's not the primary number I would look
| for!  Each install will have its own requirements.
| 
| Brian
| 
| On Oct 23, 2012, at 6:01 PM, Rita <rmorgan...@gmail.com> wrote:
| 
| > I was curious because a vendor (a big storage company) presented a
| > Hadoop solution they were offering. They posted IOPS figures, and I
| > wasn't sure how they were determining those numbers.
| > 
| > 
| > 
| > On Tue, Oct 23, 2012 at 9:19 AM, Michael Segel
| > <michael_se...@hotmail.com> wrote:
| > 
| >> You have two issues.
| >> 
| >> 1) You need to know the throughput in terms of data transfer between
| >> disks and controller cards on the node.
| >> 
| >> 2) The actual network throughput of having all of the nodes talking to
| >> one another as fast as they can. This will let you see your real
| >> limitations in the ToR Switch's fabric.
| >> 
| >> Not sure why you really want to do this except to test the disk, disk
| >> controller, and then networking infrastructure of your ToR and then
| >> your backplane to connect multiple racks....
| >> 
| >> 
| >> HTH
| >> 
| >> -Mike
| >> 
| >> On Oct 23, 2012, at 7:47 AM, Ravi Prakash <ravi...@ymail.com> wrote:
| >> 
| >>> Do you mean in a cluster being used by users, or as a benchmark to
| >>> measure the maximum?
| >>> 
| >>> The JMX page <nn:port>/jmx provides some interesting stats, but I'm
| >>> not sure they have what you want. And I'm unaware of other tools
| >>> which could.
| >>> 
| >>> 
| >>> 
| >>> 
| >>> 
| >>> ________________________________
| >>> From: Rita <rmorgan...@gmail.com>
| >>> To: common-user@hadoop.apache.org; Ravi Prakash
| >>> <ravi...@ymail.com>
| >>> Sent: Monday, October 22, 2012 6:46 PM
| >>> Subject: Re: measuring iops
| >>> 
| >>> Is it possible to know how many reads and writes are occurring
| >>> through the entire cluster in a consolidated manner -- not counting
| >>> replication?
| >>> 
| >>> 
| >>> On Mon, Oct 22, 2012 at 10:28 AM, Ravi Prakash <ravi...@ymail.com>
| >>> wrote:
| >>> 
| >>>> Hi Rita,
| >>>> 
| >>>> SliveTest can help you measure the number of reads / writes /
| >>>> deletes / ls / appends per second your NameNode can handle.
| >>>> 
| >>>> DFSIO can be used to help you measure the amount of throughput.
| >>>> 
| >>>> Both these tests are actually very flexible and have a plethora of
| >>>> options to help you test different facets of performance. In my
| >>>> experience, you actually have to be very careful and understand what
| >>>> the tests are doing for the results to be sensible.
| >>>> 
| >>>> HTH
| >>>> Ravi
| >>>> 
| >>>> 
| >>>> 
| >>>> 
| >>>> ________________________________
| >>>> From: Rita <rmorgan...@gmail.com>
| >>>> To: "<common-user@hadoop.apache.org>" <common-user@hadoop.apache.org>
| >>>> Sent: Monday, October 22, 2012 7:23 AM
| >>>> Subject: Re: measuring iops
| >>>> 
| >>>> Anyone?
| >>>> 
| >>>> 
| >>>> On Sun, Oct 21, 2012 at 8:30 AM, Rita <rmorgan...@gmail.com> wrote:
| >>>> 
| >>>>> Hi,
| >>>>> 
| >>>>> Was curious if there was a method to measure the total number of
| >>>>> IOPS (I/O operations per second) on an HDFS cluster.
| >>>>> 
| >>>>> 
| >>>>> 
| >>>>> --
| >>>>> --- Get your facts first, then you can distort them as you
| >>>>> please.--
| >>>>> 
