I once tried to measure and report them at http://wiki.apache.org/hadoop/DataProcessingBenchmarks. I stopped because I just couldn't find the time to keep it up. If you or anyone else has experience with Hadoop, please report your results on that page. :)
/Edward

On Thu, Sep 18, 2008 at 7:25 PM, Naama Kraus <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am looking for information in the area of Hadoop tracing, instrumentation,
> benchmarking and so forth.
> What utilities exist ? What's their maturity? Where can I get more info
> about them ?
>
> I am curious about statistics on Hadoop behavior (per a typical workload ?
> different workloads ?). I am thinking on various metrics such as -
> Percentage of time a Hadoop job spends on the various phases (map, sort &
> shuffle, reduce), on I/O, network, framework execution time, user code
> execution time ...
> Known bottlenecks ?
> And whatever else interesting statistics.
>
> Has anyone already measured ? Any documented statistics out there ?
>
> I already encountered various stuff like the X-trace based tracing tool from
> Berkeley, Hadoop metrics API, Hadoop instrumentation API (HADOOP-3772),
> Hadoop Vaidya (HADOOP-4179), gridmix benchmark.
>
> Does anyone have an input on any of those ?
> Anything else I missed ?
>
> Thanks for any direction,
> Naama
>
> --
> oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo
> 00 oo 00 oo
> "If you want your children to be intelligent, read them fairy tales. If you
> want them to be more intelligent, read them more fairy tales." (Albert
> Einstein)
>

--
Best regards,
Edward J. Yoon
[EMAIL PROTECTED]
http://blog.udanax.org
