[ https://issues.apache.org/jira/browse/HADOOP-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292910#comment-17292910 ]
Mehakmeet Singh commented on HADOOP-17553:
------------------------------------------
A few questions I had:
* Creating the JSON is also optional, right?
* Are we saving a JSON file at every .close(), and only when debug is on?
* So, basically, we would have a map of <FileSystem, IOStats> in
FileSystem.java, so that we could relate the IOStats to the filesystem in use?
* Would we have a .json file for every principal and for every job we run?
Would that take up a lot of space?
* {quote}extend IOStatisticsSnapshot with list of <string, string> options for
use in annotating saved logs (hostname, principal, jobID, ...). Don’t know how
to merge these.
{quote}
I didn't understand this bit. Do you mean having some particular attribute as a
key to an IOStatisticsSnapshot instance, so that we can look up the stats for
that attribute?
* Also, Rajesh's point: how will these files be cleaned up, and by whom?
> FileSystem.close() to optionally log IOStats; save to local dir
> ---------------------------------------------------------------
>
> Key: HADOOP-17553
> URL: https://issues.apache.org/jira/browse/HADOOP-17553
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs, fs/azure, fs/s3
> Affects Versions: 3.3.1
> Reporter: Steve Loughran
> Assignee: Mehakmeet Singh
> Priority: Major
>
> We could save the IOStats to a local temp dir as JSON (the snapshot is
> designed to be serializable, even has a test), with a unique name
> (iostats-stevel-s3a-bucket1-timestamp-random#.json ... etc).
> We can collect these (Rajesh can, anyway), and then
> * look for load on a specific bucket
> * look at what happened at a specific time
> The best bit: the IOStatisticsSnapshot aggregates counters, min/max/mean, so
> you could merge iostats-*-s3a-bucket1-*.json to get the IOStats of all
> principals working with a given bucket
> This will be local, so low cost: low enough that we could turn it on in
> production. All that's needed is collection of the stats from the local hosts
> (or they write to a shared mounted volume).
> We will need some "hadoop iostats merge" command to take multiple files and
> merge them all together; print to screen or save to a new file.
> Straightforward as all the load and merge code is present.
> Needs
> * logging in FS.close
> * new iostats CLI + docs, tests
> * extend IOStatisticsSnapshot with list of <string, string> options for use
> in annotating saved logs (hostname, principal, jobID, ...). Don't know how to
> merge these.
> If we are going to add a new context map to the IOStatisticsSnapshot then we
> MUST update it before 3.3.1 ships so as to avoid breaking the serialization
> format on the next release, especially the java one.
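
To illustrate the merge described in the issue, here is a minimal Java sketch of how counters, minimums, maximums and means from several saved snapshots could be combined. The class names (IOStatsMergeSketch, Snapshot, Mean) and the statistic keys are hypothetical stand-ins, not the Hadoop IOStatisticsSnapshot API. It also assumes a mean is kept as a sum plus a sample count, so that merged means stay exact.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch only; none of these types are the real Hadoop classes. */
public class IOStatsMergeSketch {

    /** A mean stored as (sum, samples) so two means can be combined exactly. */
    static final class Mean {
        final long sum;
        final long samples;

        Mean(long sum, long samples) {
            this.sum = sum;
            this.samples = samples;
        }

        Mean add(Mean other) {
            return new Mean(sum + other.sum, samples + other.samples);
        }

        double value() {
            return samples == 0 ? 0.0 : (double) sum / samples;
        }
    }

    /** Stand-in for one saved iostats-*.json file after loading. */
    static final class Snapshot {
        final Map<String, Long> counters = new TreeMap<>();
        final Map<String, Long> minimums = new TreeMap<>();
        final Map<String, Long> maximums = new TreeMap<>();
        final Map<String, Mean> means = new TreeMap<>();

        /** Fold another snapshot into this one, key by key. */
        void aggregate(Snapshot other) {
            other.counters.forEach((k, v) -> counters.merge(k, v, Long::sum));
            other.minimums.forEach((k, v) -> minimums.merge(k, v, Long::min));
            other.maximums.forEach((k, v) -> maximums.merge(k, v, Long::max));
            other.means.forEach((k, v) -> means.merge(k, v, Mean::add));
        }
    }

    public static void main(String[] args) {
        // Two snapshots, e.g. from two principals using the same bucket.
        Snapshot first = new Snapshot();
        first.counters.put("op_open", 3L);
        first.minimums.put("op_open.min", 12L);
        first.maximums.put("op_open.max", 40L);
        first.means.put("op_open.mean", new Mean(78, 3));

        Snapshot second = new Snapshot();
        second.counters.put("op_open", 2L);
        second.counters.put("op_close", 2L);
        second.minimums.put("op_open.min", 7L);
        second.maximums.put("op_open.max", 35L);
        second.means.put("op_open.mean", new Mean(42, 2));

        first.aggregate(second);

        System.out.println("counters = " + first.counters);
        System.out.println("minimums = " + first.minimums);
        System.out.println("maximums = " + first.maximums);
        System.out.println("mean     = " + first.means.get("op_open.mean").value());
    }
}
```

A merge command along the lines of the proposed "hadoop iostats merge" would load each file into such a structure and aggregate them in turn. How the <string, string> annotations should be merged is left open here, as it is in the issue.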