[
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erik Krogen updated HDFS-12345:
-------------------------------
Status: Patch Available (was: In Progress)
It took a while to finally get to this, but I'm happy to be attaching an
initial stab at moving Dynamometer from our GitHub repository into Hadoop
Tools! It builds, puts itself into the distribution, and the tests pass (at
least locally... will let Jenkins see if it agrees).
This is based off of the [{{ekrogen-hadoop-3-support}} branch of
Dynamometer|https://github.com/xkrogen/dynamometer/tree/ekrogen-hadoop-3-support],
which is a patch on top of the master branch changing it to support Hadoop 3.
I am thinking that a reasonable way forward may be to leave the GitHub repo as
Hadoop 2 compatible, and keep the version within Tools for Hadoop 3+.
There are still some major outstanding tasks before this can be committed:
* The documentation hasn't yet been moved to where it needs to live to work
with the site
* I'm not entirely confident that the packaging strategy I've used, with an
overall {{hadoop-dynamometer}} module containing the same three submodules as
the GitHub repo, is the right approach. Comments are welcome.
* The style doesn't match Hadoop's (in particular, lines are longer than
Hadoop's limit, so a lot of reformatting will be needed)
* I'm not sure whether system properties are properly passed through to the
tests where necessary
* I/we need to make a decision about version compatibility. Dynamometer was
designed to be able to run multiple versions of Hadoop from a single
Dynamometer release. Does this still make sense now that Dynamometer is within
Hadoop itself? I think so, to accommodate scenarios where you have a cluster
running Hadoop version X but you want to test out what an upgrade to Hadoop
version Y might look like.
In addition to these blocker items, I think a few things are well suited to
follow-on tasks:
* Currently Dynamometer always downloads a Hadoop tarball to use for tests
(caching it locally between runs), overridable by a system property. It seems
like it should probably use the local build when possible.
* As Wei-Chiu mentioned above, we need proper unit testing (there is mostly one
big integration test for now) and support for more features.
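The lookup order hinted at in the first follow-on item could be sketched
roughly as below. This is only an illustration under stated assumptions: the
class {{HadoopTarballResolver}}, its method and enum names, and the idea of
passing in an existence check are all hypothetical and are not taken from the
Dynamometer code base. The order itself (explicit override, then local build,
then cached download, then fresh download) is one possible reading of "use the
local build when possible", not a decided design.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Predicate;

/** Hypothetical helper; none of these names exist in Dynamometer. */
final class HadoopTarballResolver {

  /** Where the tarball used by the tests would come from. */
  enum Source { OVERRIDE, LOCAL_BUILD, CACHE, DOWNLOAD }

  /** The chosen source plus the path to read from (or download to). */
  static final class Result {
    final Source source;
    final Path path;

    Result(Source source, Path path) {
      this.source = source;
      this.path = path;
    }
  }

  /**
   * @param override   value of the overriding system property, or null/empty
   * @param localBuild tarball produced by the local build, or null if unknown
   * @param cache      location where a downloaded tarball is cached
   * @param exists     existence check, injected so the logic is easy to test
   */
  static Result resolve(String override, Path localBuild, Path cache,
      Predicate<Path> exists) {
    // 1. An explicit override always wins.
    if (override != null && !override.isEmpty()) {
      return new Result(Source.OVERRIDE, Paths.get(override));
    }
    // 2. Prefer the tarball from the local build when it is present.
    if (localBuild != null && exists.test(localBuild)) {
      return new Result(Source.LOCAL_BUILD, localBuild);
    }
    // 3. Reuse a tarball cached by a previous run.
    if (exists.test(cache)) {
      return new Result(Source.CACHE, cache);
    }
    // 4. Nothing usable locally: the caller downloads to the cache path.
    return new Result(Source.DOWNLOAD, cache);
  }
}
```

In a real test setup the {{exists}} argument would presumably be something like
{{Files::isRegularFile}}, and {{override}} would come from
{{System.getProperty(...)}} with whatever property name the tests already use.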
> Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)
> --------------------------------------------------------------------------
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: namenode, test
> Reporter: Zhe Zhang
> Assignee: Erik Krogen
> Priority: Major
> Attachments: HDFS-12345.000.patch
>
>
> Dynamometer has now been open sourced on our [GitHub
> page|https://github.com/linkedin/dynamometer]. Read more at our [recent blog
> post|https://engineering.linkedin.com/blog/2018/02/dynamometer--scale-testing-hdfs-on-minimal-hardware-with-maximum].
> To encourage getting the tool into the open for others to use as quickly as
> possible, we went through our standard open sourcing process of releasing on
> GitHub. However, we are interested in the possibility of donating this to
> Apache as part of Hadoop itself and would appreciate feedback on whether or
> not this is something that would be supported by the community.
> Also of note, previous [discussions on the dev mail
> lists|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201707.mbox/%[email protected]%3e]