[jira] [Commented] (HADOOP-17553) FileSystem.close() to optionally log IOStats; save to local dir

Mehakmeet Singh (Jira) Mon, 01 Mar 2021 06:26:08 -0800


    [ 
https://issues.apache.org/jira/browse/HADOOP-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292910#comment-17292910
 ]


Mehakmeet Singh commented on HADOOP-17553:
------------------------------------------

I had a few questions:
 * Creating the JSON is also optional, right?
 * Are we saving a JSON file at every .close(), and only when debug is on?
 * So, basically, we would keep a map of <FileSystem, IOStats> in 
FileSystem.java, so that we can relate the IOStats to the filesystem in use?
 * Would we have a .json file for every principal and for every job we run? 
Would that take up a lot of space?
 *  
{quote}extend IOStatisticsSnapshot with list of <string, string> options for 
use in annotating saved logs (hostname, principal, jobID, ...). Don’t know how 
to merge these.
{quote}I didn't understand this bit. Do you mean using some particular 
attribute as a key into an IOStatsSnapshot instance, to look up the stats for 
that attribute?
 * There is also Rajesh's point about how these files will be cleaned up, and by whom.
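
To make the questions about optional saving and file naming concrete, here is a minimal Java sketch of a report that is named and written only when a switch is on. Everything in it is an assumption for illustration: the class IOStatsReportSketch, its methods, the "enabled" flag and the exact name layout (modelled on the iostats-stevel-s3a-bucket1-timestamp-random#.json pattern in the description) are hypothetical and not part of Hadoop, and the JSON is passed in as a plain string rather than produced by IOStatisticsSnapshot.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration only; not a Hadoop class. */
final class IOStatsReportSketch {

  private IOStatsReportSketch() {
  }

  /** Replace anything that is unsafe in a file name with an underscore. */
  static String sanitize(String s) {
    return s.replaceAll("[^A-Za-z0-9_.-]", "_");
  }

  /** Build a name of the form iostats-principal-scheme-bucket-timestamp-random.json. */
  static String reportName(String principal, String scheme, String bucket,
      long timestampMillis, int random) {
    return String.format(Locale.ROOT, "iostats-%s-%s-%s-%d-%04d.json",
        sanitize(principal), sanitize(scheme), sanitize(bucket),
        timestampMillis, random);
  }

  /**
   * Write the JSON text to dir/name, but only if the (assumed) switch is on.
   * Returns the path written, or empty if saving was disabled.
   */
  static Optional<Path> saveIfEnabled(boolean enabled, Path dir, String name,
      String json) {
    if (!enabled) {
      return Optional.empty();
    }
    try {
      Files.createDirectories(dir);
      Path target = dir.resolve(name);
      Files.write(target, json.getBytes(StandardCharsets.UTF_8));
      return Optional.of(target);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path dir = Paths.get(System.getProperty("java.io.tmpdir"), "iostats-sketch");
    String name = reportName("stevel", "s3a", "bucket1",
        System.currentTimeMillis(), 7);
    System.out.println(saveIfEnabled(true, dir, name, "{}"));
  }
}
```

With a layout like this, one file is written per closed filesystem instance, which is why the space and cleanup questions matter.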

> FileSystem.close() to optionally log IOStats; save to local dir
> ---------------------------------------------------------------
>
>                 Key: HADOOP-17553
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17553
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/azure, fs/s3
>    Affects Versions: 3.3.1
>            Reporter: Steve Loughran
>            Assignee: Mehakmeet Singh
>            Priority: Major
>
> We could save the IOStats to a local temp dir as JSON (the snapshot is 
> designed to be serializable, even has a test), with a unique name 
> (iostats-stevel-s3a-bucket1-timestamp-random#.json ... etc). 
> We can collect these (Rajesh can, anyway), and then
> * look for load on a specific bucket
> * look at what happened at a specific time
> The best bit: the IOStatisticsSnapshot aggregates counters, min/max/mean, so 
> you could merge iostats-*-s3a-bucket1-*.json to get the IOStats of all 
> principals working with a given bucket
> This will be local, so low cost: low enough that we could turn it on in 
> production. All that's needed is collection of the stats from the local hosts 
> (or they write to a shared mounted volume)
> We will need some "hadoop iostats merge" command to take multiple files and 
> merge them all together; print to screen or save to a new file. 
> Straightforward as all the load and merge code is present.
> Needs
> * logging in FS.close
> * new iostats CLI + docs, tests
> * extend IOStatisticsSnapshot with list of <string, string> options for use 
> in annotating saved logs (hostname, principal, jobID, ...). Don't know how to 
> merge these.
> If we are going to add a new context map to the IOStatisticsSnapshot then we 
> MUST update it before 3.3.1 ships so as to avoid breaking the serialization 
> format on the next release, especially the java one. 
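
The merge described above (counters, min/max/mean combined across several saved files) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the Hadoop implementation: StatsSnapshotSketch and its fields are hypothetical stand-ins for IOStatisticsSnapshot so the example runs without Hadoop on the classpath, and each mean is assumed to be stored as a (samples, sum) pair so that merging stays exact.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a statistics snapshot; not a Hadoop class. */
final class StatsSnapshotSketch {
  final Map<String, Long> counters = new TreeMap<>();
  final Map<String, Long> minimums = new TreeMap<>();
  final Map<String, Long> maximums = new TreeMap<>();
  /** Each mean is kept as {samples, sum} so two means can be combined. */
  final Map<String, long[]> means = new TreeMap<>();

  /** Fold another snapshot into this one. */
  void aggregate(StatsSnapshotSketch other) {
    // Counters add up.
    other.counters.forEach((k, v) -> counters.merge(k, v, Long::sum));
    // Minimums keep the smaller value, maximums the larger.
    other.minimums.forEach((k, v) -> minimums.merge(k, v, Long::min));
    other.maximums.forEach((k, v) -> maximums.merge(k, v, Long::max));
    // Means combine by adding sample counts and sums.
    other.means.forEach((k, v) -> means.merge(k, v.clone(),
        (a, b) -> new long[] {a[0] + b[0], a[1] + b[1]}));
  }

  /** Arithmetic mean for a key, or 0.0 if there are no samples. */
  double mean(String key) {
    long[] m = means.get(key);
    return (m == null || m[0] == 0) ? 0.0 : (double) m[1] / m[0];
  }

  /** Merge many snapshots into a new one, leaving the inputs untouched. */
  static StatsSnapshotSketch mergeAll(List<StatsSnapshotSketch> snapshots) {
    StatsSnapshotSketch result = new StatsSnapshotSketch();
    for (StatsSnapshotSketch s : snapshots) {
      result.aggregate(s);
    }
    return result;
  }
}
```

A merge command along the lines of "hadoop iostats merge" would load each JSON file into a snapshot and fold them together in this way. The annotation options (hostname, principal, jobID) have no obvious merge rule, which is the open question in the last "Needs" item.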



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
