On Mar 29, 2007, at 12:07 PM, Doug Cutting wrote:
> Nigel Daley wrote:
>> So shouldn't fixing this test to conform to the new model in
>> HADOOP-1134 be the concern of the patch for HADOOP-1134?
> Yes, but, as it stands, this patch would silently stop working
> correctly once HADOOP-1134 is committed. It should instead be
> written in a more robust way that can survive expected changes.
> Relying on HDFS using ChecksumFileSystem isn't as reliable as an
> explicit constructor that says "I want an unchecksummed FileSystem."
Ya, that's fine. I have no problem changing the way the patch is
implemented.
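
Something along these lines is what I'd picture (a sketch only: the
RawFileSystemFactory class and its getRaw() method are made up for
illustration, though ChecksumFileSystem.getRawFileSystem() is real
API). The caller states its intent once, and the call keeps working
even if the default FileSystem stops being a ChecksumFileSystem:

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.ChecksumFileSystem;
  import org.apache.hadoop.fs.FileSystem;

  // Hypothetical factory: an explicit "I want an unchecksummed
  // FileSystem" entry point that survives HADOOP-1134.
  public class RawFileSystemFactory {
    public static FileSystem getRaw(Configuration conf) throws IOException {
      FileSystem fs = FileSystem.get(conf);
      if (fs instanceof ChecksumFileSystem) {
        // Strip the client-side checksum layer when one is present.
        return ((ChecksumFileSystem) fs).getRawFileSystem();
      }
      // No checksum layer (e.g. after HADOOP-1134): use fs as-is.
      return fs;
    }
  }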
>> As it stands, I can't run NNBench at scale without using a raw file
>> system, which is what this patch is intended to allow.
> It seems strange to disable things in an undocumented and
> unsupported way in order to get a benchmark to complete. How does
> that prove scalability? Rather, leaving NNBench alone seems like a
> strong argument for implementing HADOOP-1134 sooner.
As you realized below, the test was using raw methods before
HADOOP-928. I don't understand your reference to "undocumented" and
"unsupported", but I'm not sure it matters.
> Still, if you want to be able to disable checksums, for benchmarks
> or whatever, we can permit that, but should do so explicitly.
>> HADOOP-928 caused this test to use a ChecksumFileSystem and
>> subsequently we saw our "read" TPS metric plummet from 20,000 to a
>> couple hundred.
> Ah, NNBench used the 'raw' methods before, which was kind of sneaky
> on its part, since it didn't benchmark the typical user experience.
> Although the namenode performance should only halve at worst with
> checksums as currently implemented, no?
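
(For reference, the "halve at worst" presumably follows from
ChecksumFileSystem keeping each file's checksums in a hidden ".crc"
side file: every open or create touches two namespace entries instead
of one, so namenode operations at most double and TPS at most halves.
A quick illustration of the side-file naming, using the real
getChecksumFile() method against the local filesystem:)

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.LocalFileSystem;
  import org.apache.hadoop.fs.Path;

  public class CrcSideFile {
    public static void main(String[] args) throws Exception {
      // LocalFileSystem is a ChecksumFileSystem, so it can report the
      // side file that shadows each data file.
      LocalFileSystem fs = FileSystem.getLocal(new Configuration());
      // Prints "/bench/.file0.crc": a second file, hence a second
      // namespace operation, for every data file accessed.
      System.out.println(fs.getChecksumFile(new Path("/bench/file0")));
    }
  }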
One of the design goals of the test is to remove the effects of
DataNodes as much as possible, since this is a NameNode benchmark.
That's why we used the raw methods (and therefore no CRCs). We run it
with 1-byte files, 1-byte blocks, and a replication factor of 1, all
designed to maximize the load on the NameNode and minimize the
effects of the DataNodes.
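
Concretely, the setup amounts to something like this (a sketch, not
NNBench's actual code; the property names are the current HDFS ones):

  import org.apache.hadoop.conf.Configuration;

  public class NNStressConf {
    public static Configuration create() {
      Configuration conf = new Configuration();
      // One replica per block: minimal DataNode traffic.
      conf.set("dfs.replication", "1");
      // 1-byte blocks: block bookkeeping, not data transfer, dominates.
      conf.set("dfs.block.size", "1");
      return conf;
    }
  }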
>> Let's get our current benchmark back on track before we commit
>> HADOOP-1134 (which will likely take a while before it is "Patch
>> Available").
> I'd argue that we should fix the benchmark to accurately reflect
> what users see, so that we see real improvement when HADOOP-1134 is
> committed. That would make it a more useful and realistic
> benchmark. However, if you believe that a checksum-free benchmark is
> still useful, I think it should be more future-proof.
I think this is the crux of the misunderstanding. This is a NameNode
benchmark, not a DataNode benchmark nor a system benchmark. It
attempts to measure the TPS the NameNode can sustain in the extreme
case. I think you want a different kind of benchmark, which is fair.
It's just not this benchmark.
Cheers,
Nige