[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15260498#comment-15260498 ]

Colin Patrick McCabe commented on HDFS-10175:
---------------------------------------------

BTW, sorry for the last-minute-ness of this scheduling, [~liuml07] and [~steve_l]. Webex here at 10:30:

HDFS-10175 webex
Wednesday, April 27, 2016 10:30 am | Pacific Daylight Time (San Francisco, GMT-07:00) | 1 hr

JOIN WEBEX MEETING
https://cloudera.webex.com/cloudera/j.php?MTID=mebca25435f158dec71b2589561e71b29
Meeting number: 294 963 170
Meeting password: 1234

JOIN BY PHONE
1-650-479-3208 Call-in toll number (US/Canada)
Access code: 294 963 170
Global call-in numbers: https://cloudera.webex.com/cloudera/globalcallin.php?serviceType=MC=45642173=0

> add per-operation stats to FileSystem.Statistics
> ------------------------------------------------
>
>          Key: HDFS-10175
>          URL: https://issues.apache.org/jira/browse/HDFS-10175
>      Project: Hadoop HDFS
>   Issue Type: Improvement
>   Components: hdfs-client
>     Reporter: Ram Venkatesh
>     Assignee: Mingliang Liu
>  Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, HDFS-10175.005.patch, HDFS-10175.006.patch, TestStatisticsOverhead.java
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. There is logic within DfsClient to map operations to these counters that can be confusing; for instance, mkdirs counts as a writeOp.
> Proposed enhancement: add a statistic for each DfsClient operation, including create, append, createSymlink, delete, exists, mkdirs and rename, and expose them as new properties on the Statistics object. The operation-specific counters can be used for analyzing the load imposed by a particular job on HDFS. For example, we can use them to identify jobs that end up creating a large number of files.
> Once this information is available in the Statistics object, app frameworks like MapReduce can expose them as additional counters to be aggregated and recorded as part of the job summary.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
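The per-operation counters proposed above could be sketched roughly as follows. This is an illustrative sketch only, not taken from any of the attached patches; the class and method names are invented:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of per-operation counters (names invented): each
// DfsClient operation increments its own counter instead of being folded
// into a single readOps/writeOps total.
public class PerOpStatistics {
    private final Map<String, AtomicLong> opCounts = new ConcurrentHashMap<>();

    // Record one invocation of a named operation, e.g. "mkdirs" or "rename".
    public void incrementOp(String op) {
        opCounts.computeIfAbsent(op, k -> new AtomicLong()).incrementAndGet();
    }

    // Read back the count for one operation; 0 if never recorded.
    public long getOpCount(String op) {
        AtomicLong c = opCounts.get(op);
        return c == null ? 0 : c.get();
    }
}
```

Under this shape, a framework like MapReduce could read each counter by name at job end and publish it as an additional job counter.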
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15260487#comment-15260487 ]

Colin Patrick McCabe commented on HDFS-10175:
---------------------------------------------

Great. Let me add a webex.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259851#comment-15259851 ]

Steve Loughran commented on HDFS-10175:
---------------------------------------

bq. One thing that is a bit concerning about metrics2 is that I think people feel that this interface should be stable (i.e. don't remove or alter things once they're in), which would be a big constraint on us.

Ops teams don't like metrics they rely on being taken away; they also view published metrics as the API. As the compatibility docs say, "Metrics should preserve compatibility within the major release."

bq. Perhaps we could document that per-fs stats were @Public @Evolving rather than stable

+1 to that, though it'll be important not to break binary compatibility with external filesystems.

FWIW, the issue I have with metrics2, apart from "steve doesn't understand the design fully", is its preference for singletons registered with JMX. This makes sense in deployed services, but not for tests.

bq. Do we have any ideas about how Spark will consume these metrics in the longer term?

Spark is a Coda Hale-instrumented codebase, as are many other apps these days. Integration between Hadoop metrics of any form and the Coda Hale libraries would be something to address, but not here.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259838#comment-15259838 ]

Steve Loughran commented on HDFS-10175:
---------------------------------------

Wednesday, April 26, 10:30 AM PST; 18:30 UK, 17:30 GMT works for me. Webex binding? If you don't specify one, mine is at https://hortonworks.webex.com/meet/stevel
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259234#comment-15259234 ]

Mingliang Liu commented on HDFS-10175:
--------------------------------------

Either tomorrow or next week will work for me. Thanks.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259223#comment-15259223 ]

Colin Patrick McCabe commented on HDFS-10175:
---------------------------------------------

Hi [~steve_l], does 10:30 AM work tomorrow? Unfortunately I'll be out on Thursday and most of Friday, so if we can't do tomorrow we'd have to do Friday afternoon or early next week.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258726#comment-15258726 ]

Colin Patrick McCabe commented on HDFS-10175:
---------------------------------------------

bq. I see that, but stream-level counters are essential at least for the tests which verify forward and lazy seeks. Which means that yes, they do have to go into the 2.8.0 release. What I can do is set up the scope so that they are package private, then, in the test code, implement the assertions about metric-derived state into that package.

I guess my hope here is that whatever mechanism we come up with is something that could easily be integrated into the upcoming 2.8 release. Since we have talked about requiring our new metrics to not modify existing stable public interfaces, that seems very reasonable. One thing that is a bit concerning about metrics2 is that I think people feel that this interface should be stable (i.e. don't remove or alter things once they're in), which would be a big constraint on us. Perhaps we could document that per-fs stats were \@Public \@Evolving rather than stable?

bq. Regarding the metrics2 instrumentation in HADOOP-13028, I'm aggregating the stream statistics back into the metrics 2 data. That's something which isn't needed for the hadoop tests, but which I'm logging in spark test runs, such as (formatted for readability):

Do we have any ideas about how Spark will consume these metrics in the longer term? Do they prefer to go through metrics2, for example? I definitely don't object to putting this kind of stuff in metrics2, but if we go that route, we have to accept that we'll just get global (or at best per-fs-type) statistics, rather than per-fs-instance statistics. Is that acceptable? So far, nobody has spoken up strongly in favor of per-fs-instance statistics.
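The global-versus-per-instance distinction discussed here can be illustrated with a small sketch. This is purely hypothetical; neither class exists in Hadoop, and the names are invented:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative-only sketch: metrics keyed per filesystem *type* (scheme)
// versus per *instance*. All names are invented for this example.
public class StatsScopes {
    // Global, per-scheme counters: all instances of a scheme share one adder.
    private static final Map<String, LongAdder> perScheme = new ConcurrentHashMap<>();

    // Per-instance counter: each filesystem object keeps its own.
    private final LongAdder instanceReadOps = new LongAdder();
    private final String scheme;

    public StatsScopes(String scheme) { this.scheme = scheme; }

    public void recordReadOp() {
        instanceReadOps.increment();  // visible per-instance
        perScheme.computeIfAbsent(scheme, k -> new LongAdder()).increment();  // aggregated per-type
    }

    public long instanceReadOps() { return instanceReadOps.sum(); }

    public static long schemeReadOps(String scheme) {
        LongAdder a = perScheme.get(scheme);
        return a == null ? 0 : a.sum();
    }
}
```

A metrics2-style sink would typically only see the static, per-scheme totals; the per-instance numbers are what would be lost under a global-only design.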
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258706#comment-15258706 ]

Colin Patrick McCabe commented on HDFS-10175:
---------------------------------------------

bq. I prefer earlier (being in UK time and all); I could do the first half hour of the webex \[at 12:30pm\]

How about 10:30 AM PST to noon tomorrow, on Wednesday?
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258012#comment-15258012 ]

Steve Loughran commented on HDFS-10175:
---------------------------------------

One piece of background here is what I'm currently exploring in terms of [Metrics-first testing|http://steveloughran.blogspot.co.uk/2016/04/distributed-testing-making-use-of.html], that is, instrumenting code and using its observed state in both unit and system tests. I've done it in Slider (SLIDER-82) and Spark (SPARK-7889) and found it highly effective for writing deterministic tests which also provide information parseable by test runners for better analysis of test runs.

bq. I understand your eagerness to get the s3 stats in, but I would rather not proliferate more statistics interfaces if possible. Once they're in, we really can't get rid of them, and it becomes very confusing and clunky.

I see that, but stream-level counters are essential at least for the tests which verify forward and lazy seeks. Which means that yes, they do have to go into the 2.8.0 release. What I can do is set up the scope so that they are package private, then, in the test code, implement the assertions about metric-derived state in that package.

Regarding the metrics2 instrumentation in HADOOP-13028, I'm aggregating the stream statistics back into the metrics2 data. That's something which isn't needed for the Hadoop tests, but which I'm logging in Spark test runs, such as (formatted for readability):

{code}
2016-04-26 12:08:25,901 executor.Executor Running task 0.0 in stage 0.0 (TID 0)
2016-04-26 12:08:25,924 rdd.HadoopRDD Input split: s3a://landsat-pds/scene_list.gz:0+20430493
2016-04-26 12:08:26,107 compress.CodecPool - Got brand-new decompressor [.gz]
2016-04-26 12:08:32,304 executor.Executor Finished task 0.0 in stage 0.0 (TID 0). 2643 bytes result sent to driver
2016-04-26 12:08:32,311 scheduler.TaskSetManager Finished task 0.0 in stage 0.0 (TID 0) in 6434 ms on localhost (1/1)
2016-04-26 12:08:32,312 scheduler.TaskSchedulerImpl Removed TaskSet 0.0, whose tasks have all completed, from pool
2016-04-26 12:08:32,315 scheduler.DAGScheduler ResultStage 0 finished in 6.447 s
2016-04-26 12:08:32,319 scheduler.DAGScheduler Job 0 finished took 6.560166 s
2016-04-26 12:08:32,320 s3.S3aIOSuite size of s3a://landsat-pds/scene_list.gz = 464105 rows read in 6779125000 nS
2016-04-26 12:08:32,324 s3.S3aIOSuite Filesystem statistics S3AFileSystem{uri=s3a://landsat-pds,
  workingDir=s3a://landsat-pds/user/stevel, partSize=104857600, enableMultiObjectsDelete=true,
  multiPartThreshold=2147483647,
  statistics { 20430493 bytes read, 0 bytes written, 3 read ops, 0 large read ops, 0 write ops},
  metrics {{Context=S3AFileSystem}
  {FileSystemId=29890500-aed6-4eb8-bb47-0c896a66aac2-landsat-pds}
  {fsURI=s3a://landsat-pds/scene_list.gz}
  {streamOpened=1} {streamCloseOperations=1} {streamClosed=1} {streamAborted=0}
  {streamSeekOperations=0} {streamReadExceptions=0} {streamForwardSeekOperations=0}
  {streamBackwardSeekOperations=0} {streamBytesSkippedOnSeek=0} {streamBytesRead=20430493}
  {streamReadOperations=1488} {streamReadFullyOperations=0} {streamReadOperationsIncomplete=1488}
  {files_created=0} {files_copied=0} {files_copied_bytes=0} {files_deleted=0}
  {directories_created=0} {directories_deleted=0} {ignored_errors=0} }}
{code}

The Spark code isn't accessing these metrics, though it could if it tried hard (went to the metric registry). It's publishing those stream-level operations which I think you are most concerned about; the other metrics are roughly a subset of those already in Azure's metrics2 instrumentation. Accordingly, I will modify the S3 instrumentation to *not* register the stream operations as metrics2 counters, retaining them internally and in the toString value.
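The pattern described here, cheap per-stream counters folded into filesystem-level instrumentation when the stream closes, might look roughly like this. It is an invented sketch under stated assumptions, not the HADOOP-13028 code:

```java
import java.util.concurrent.atomic.AtomicLong;

// Rough sketch (invented names) of the merge-on-close pattern: a stream keeps
// cheap volatile counters and folds them into the filesystem's aggregate
// exactly once, when the stream is closed.
public class StreamStats {
    // Filesystem-level aggregate, updated only at stream close.
    public static final AtomicLong fsBytesRead = new AtomicLong();

    // Per-stream counters: volatile suffices because a stream is driven by one
    // thread at a time, so there is no per-read synchronization cost.
    private volatile long bytesRead;
    private volatile boolean closed;

    public void recordRead(long n) { bytesRead += n; }

    public void close() {
        if (!closed) {
            closed = true;
            fsBytesRead.addAndGet(bytesRead);  // merge once, on close
        }
    }

    public long streamBytesRead() { return bytesRead; }
}
```

The design point is that the only cross-thread handoff happens once, at close(), which keeps the hot read path free of atomic operations.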
I hope that's enough to satisfy your concerns while still retaining the information I need in s3a functionality and testing.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257774#comment-15257774 ]

Steve Loughran commented on HDFS-10175:
---------------------------------------

I prefer earlier (being in UK time and all); I could do the first half hour of the webex.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257099#comment-15257099 ]

Mingliang Liu commented on HDFS-10175:
--------------------------------------

Thanks for proposing the discussion. The time works for me.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257074#comment-15257074 ]

Colin Patrick McCabe commented on HDFS-10175:
---------------------------------------------

We already have three statistics interfaces:
1. FileSystem#Statistics
2. DFSInputStream#ReadStatistics
3. metrics2, etc.

#1 existed for a very long time and is tied into MR in the ways discussed above. I didn't create it, but I did implement the thread-local optimization, based on some performance issues we were having.

I have to take the blame for adding #2, in HDFS-4698. At the time, the main focus was on ensuring we were doing short-circuit reads, which didn't really fit into #1. And like you, I felt that it was "very low-level stream behavior" that was decoupled from the rest of the stats.

Of course #3 has been around a while, and is used more generally than just in our storage code.

I understand your eagerness to get the s3 stats in, but I would rather not proliferate more statistics interfaces if possible. Once they're in, we really can't get rid of them, and it becomes very confusing and clunky.

Are you guys free for a webex on Wednesday afternoon? Maybe 12:30pm to 2pm?
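The thread-local optimization mentioned above for FileSystem#Statistics can be sketched as follows. This is a simplified illustration, not the HDFS implementation (which also handles thread death and negative deltas):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified sketch of a thread-local counter: each thread increments its own
// uncontended cell, and a reader sums across all threads on demand.
public class ThreadLocalCounter {
    // Registry of every thread's cell, so sum() can see them all.
    private final Map<Thread, long[]> perThread = new ConcurrentHashMap<>();

    private final ThreadLocal<long[]> local = ThreadLocal.withInitial(() -> {
        long[] cell = new long[1];
        perThread.put(Thread.currentThread(), cell);
        return cell;
    });

    // Fast path: no locks, no CAS, just a plain write to this thread's cell.
    public void increment() { local.get()[0]++; }

    // Slow path: aggregate across threads; fine for occasional reads.
    public long sum() {
        long total = 0;
        for (long[] cell : perThread.values()) total += cell[0];
        return total;
    }
}
```

The trade-off is the classic one: writes become contention-free, at the cost of a more expensive and only weakly consistent read.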
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256891#comment-15256891 ]

Steve Loughran commented on HDFS-10175:
---------------------------------------

Yes, we could have a talk. I actually think this is both related to and different from the s3a stuff; that is, it can go in without this, as it is currently more about some overall metrics and some very low-level counters for testing stream behaviour (rather than for direct reporting, though that could come):
# some JMX-able stats for each s3a instance (taken from Azure, not per-thread)
# per-stream stats (volatile, merged with the s3a stats above when the stream is close()d)
# tests to verify behaviour of the FS (e.g. lazy seek doesn't open files, read-ahead skips bytes, etc.)

I agree about consistency; for the stream stats I've done, the fields are tagged as volatile. These are merged into the JMX-layer instrumentation at close time, so sync costs are low.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256842#comment-15256842 ] Colin Patrick McCabe commented on HDFS-10175: - Thanks for commenting, [~steve_l]. It's great to see work on s3 stats. s3a has needed love for a while. Did you get a chance to look at my HDFS-10175.006.patch on this JIRA? It seems to address all of your concerns. It provides a standard API that every FileSystem can implement (not just s3, just HDFS, etc. etc.). Once we adopt jdk8, we can easily implement this API using {{java.util.concurrent.atomic.LongAdder}} if that proves to be more readable and/or efficient. bq. Don't break any existing filesystem code by adding new params to existing methods, etc. I agree. My patch doesn't add new params to any existing methods. bq. add the new code out of FileSystem I agree. That's why I separated {{StorageStatistics.java}} from {{FileSystem.java}}. {{FileContext}} should be able to use this API as well, simply by returning a {{StorageStatistics}} instance just like {{FileSystem}} does. bq. Use an int rather than an enum; lets filesystems add their own counters. I hereby reserve 0x200-0x255 for object store operations. Hmm. I'm not sure I follow. My patch identifies counters by name (string), not by an int, enum, or byte. This is necessary because different storage backends will want to track different things (s3a wants to track s3 PUTs, HDFS wants to track genstamp bump ops, etc. etc.). We should not try to create the "statistics type enum of doom" in some misguided attempt at space optimization. Consider the case of out-of-tree Filesystem implementations as well... they are not going to add entries to some enum of doom in hadoop-common. bq. public interface StatisticsSource { Mapsnapshot(); } I don't think an API that returns a map is the right approach for statistics. That map could get quite large. 
We already know that people love adding just one more statistic to things (and often for quite valid reasons). It's very difficult to -1 a patch just because it bloats the statistics map a little more. Once this API exists, the natural progression will be people adding tons and tons of new entries to it. We should be prepared for this and use an API that doesn't choke if we have tons of stats. We shouldn't have to materialize everything all the time; an iterator approach is smarter because it can be O(1) in terms of memory, no matter how many entries we have. I also don't think we need snapshot consistency for stats. It's a heavy burden for an implementation to carry (it basically requires some kind of materialization into a map, and probably synchronization to stop the world while the materialization is going on). And there is no user demand for it: the current FileSystem#Statistics interface doesn't have it, and nobody has asked for it. It seems like you are focusing on the ability to expose new stats to our metrics2 subsystem, while this JIRA originally focused on adding metrics that MapReduce could read at the end of a job. I think these two use-cases can be covered by the same API. We should try to hammer that out (probably as a HADOOP JIRA rather than an HDFS one). Do you think we should have a call about this? I know some folks who might be interested in testing the s3 metrics stuff, if there were a reasonable API to access it.
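An iterator-based, string-keyed shape like the one argued for above could look roughly like the following. This is a hedged sketch, not the actual patch: the class and method names ({{LongStatistic}}, {{getLongStatistics}}, {{MapBackedStatistics}}) are illustrative, loosely echoing the {{StorageStatistics.java}} mentioned in the comment. The point it demonstrates is that callers walk entries one at a time instead of materializing a snapshot map, and counters are identified by name so each backend can track its own operations.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative iterator-based stats API: consumers stream the entries,
// so reading the stats is O(1) in memory no matter how many counters
// a FileSystem tracks. No snapshot consistency is implied.
abstract class StorageStatisticsSketch {
    public static final class LongStatistic {
        private final String name;
        private final long value;
        LongStatistic(String name, long value) { this.name = name; this.value = value; }
        public String getName() { return name; }
        public long getValue() { return value; }
    }
    /** Walk the current counter values one entry at a time. */
    public abstract Iterator<LongStatistic> getLongStatistics();
}

// Toy backend keyed by string (not an enum), so each store can track
// whatever it wants: s3a PUTs, HDFS genstamp bumps, etc.
class MapBackedStatistics extends StorageStatisticsSketch {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();
    public void increment(String name) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    }
    public long get(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? 0L : c.get();
    }
    @Override
    public Iterator<LongStatistic> getLongStatistics() {
        Iterator<Map.Entry<String, AtomicLong>> it = counters.entrySet().iterator();
        return new Iterator<LongStatistic>() {
            public boolean hasNext() { return it.hasNext(); }
            public LongStatistic next() {
                Map.Entry<String, AtomicLong> e = it.next();
                return new LongStatistic(e.getKey(), e.getValue().get());
            }
        };
    }
}
```

A consumer that only wants one counter never pays for the rest; a consumer that wants everything iterates without forcing the backend to build a map.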
> add per-operation stats to FileSystem.Statistics
> Key: HDFS-10175
> URL: https://issues.apache.org/jira/browse/HDFS-10175
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: Ram Venkatesh
> Assignee: Mingliang Liu
> Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, HDFS-10175.005.patch, HDFS-10175.006.patch, TestStatisticsOverhead.java
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. There is logic within DfsClient to map operations to these counters that can be confusing; for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, createSymlink, delete, exists, mkdirs, rename and expose them as new properties on the Statistics object. The operation-specific counters can be used for analyzing the load imposed by a particular job on HDFS. For example, we can use them to identify jobs that end up creating a large number of files.
> Once this information is available in the Statistics object, app frameworks like MapReduce can expose them as additional counters to be aggregated and recorded as part of the job summary.
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256108#comment-15256108 ] Steve Loughran commented on HDFS-10175: --- One more thing that would be nice: have all classes which collect stats implement an interface to publish those stats: {code} public interface StatisticsSource { Map<String, Long> snapshot(); } {code} Why? It's something which FileSystem subclasses can implement, IO streams, etc. The FSData streams would implement it and pass on to wrapped streams, or return an empty map. Having a uniform mechanism like this would make it easy for tests to grab the before/after data, look for diffs, display counts, etc. By avoiding any enums in the returned snapshot, it'd be easy for applications that aren't built against a specific Hadoop version (Spark, etc.) to grab all stats available to the app, without knowing the exact mappings, and present them to users in reports.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
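The before/after test pattern Steve describes could be sketched as follows. This is an illustrative implementation under the assumption of the proposed (hypothetical) {{StatisticsSource}} interface; the {{StatsDiff}} helper name is invented here, not from any patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface from the comment: anything that collects stats
// returns a string->long snapshot, so a test can diff before/after
// without knowing in advance which counters exist.
interface StatisticsSource {
    Map<String, Long> snapshot();
}

final class StatsDiff {
    // Returns only the counters whose value changed between two snapshots,
    // mapped to their deltas.
    static Map<String, Long> diff(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> d = new HashMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long prev = before.getOrDefault(e.getKey(), 0L);
            long delta = e.getValue() - prev;
            if (delta != 0) d.put(e.getKey(), delta);
        }
        return d;
    }
}
```

A test would capture {{source.snapshot()}} before and after the operation under test and assert on the diff, exactly the "grab the before/after data, look for diffs" workflow described above.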
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255285#comment-15255285 ] Steve Loughran commented on HDFS-10175: --- # Don't break any existing filesystem code by adding new params to existing methods, etc. # Add the new code outside of FileSystem. # Use an int rather than an enum; let filesystems add their own counters. I hereby reserve 0x200-0x255 for object store operations. With an open int rather than an enum, the map size depends on the active ops, not the possible set. An initial hashmap using the int value as key should work; maybe set the default capacity to that of the "standard" FS ops. Entry creation would have to be on demand. Alternatively, do fix the number of operations at compile time, and store them in an array of longs, so per-thread storage is 8 bytes * ops * threads, with O(1) lookup. With the 46 opcodes in the patch, that's 368 bytes/fs/thread. Here the increment operation returns the new value, or -1 for either of: no logging, no such opcode. An out-of-range opcode has the cost of exception raising; no counters has the probability and penalty of branch prediction failure. {code} public long inc(int opcode, long count) { try { return counters != null ? counters[opcode] += count : -1; } catch (ArrayIndexOutOfBoundsException e) { return -1; } } {code} In this situation, the number of opcodes is fixed in the Hadoop version; I'll just pre-reserve some of the values for object store operations.
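A caveat on the fixed-size array idea: the per-thread variant needs no synchronization, but if a single *shared* array were used instead, a plain {{+=}} on a {{volatile long[]}} element would lose updates (array elements are not volatile, and {{+=}} is not atomic in Java). A minimal thread-safe variant, illustrative only and not from any patch, could use {{AtomicLongArray}} and keep the same "return -1 for a bad opcode, no exception cost on the hot path" contract:

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Illustrative shared-counter variant of the fixed-opcode scheme.
// AtomicLongArray gives lock-free O(1) increments per slot; bounds are
// checked up front so the hot path never raises an exception.
final class OpCounters {
    private final AtomicLongArray counters;

    OpCounters(int numOpcodes) {
        counters = new AtomicLongArray(numOpcodes);
    }

    /** Returns the new value, or -1 for an unknown opcode. */
    long inc(int opcode, long count) {
        if (opcode < 0 || opcode >= counters.length()) {
            return -1;
        }
        return counters.addAndGet(opcode, count);
    }

    long get(int opcode) {
        return (opcode < 0 || opcode >= counters.length()) ? -1 : counters.get(opcode);
    }
}
```

The explicit bounds check replaces the try/catch: a branch that almost always predicts correctly is cheaper than exception machinery, and the -1 convention is preserved.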
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255237#comment-15255237 ] Steve Loughran commented on HDFS-10175: --- Yes, contention is an issue, especially against filesystems which respond fast. But contention is not mandatory. Coda Hale counters use a {{com.codahale.metrics.LongAdder}} class which queues up addition ops under load so threads don't block: bq. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption. This class is now built in to Java 8 as {{java.util.concurrent.atomic.LongAdder}}, alongside {{java.util.concurrent.atomic.LongAccumulator}}. Even with code built against Java 7, whatever is done here should be designed so that the switch to Java 8 is seamless and transparent; that is, the specific counter implementation should be hidden. I'd almost advocate using the Coda Hale one, except that it would add a new dependency everywhere; it's only used in a couple of modules right now. And for trunk we may as well switch to Java 8. (Life would be so much easier if volatiles implemented atomic add/inc ops the way CPUs allow.)
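The "hide the counter implementation so the Java 8 switch is seamless" point can be sketched as follows. This is an illustrative design, not code from the patch: a tiny counter interface with a JDK 8 {{LongAdder}} backend and an {{AtomicLong}} backend that could serve as the Java 7 stand-in.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hiding the counter behind an interface lets the backend be swapped
// (AtomicLong on Java 7, LongAdder on Java 8) without touching callers.
interface Counter {
    void inc();
    long value();
}

// LongAdder spreads contended updates across cells: under high update
// contention it trades some memory for much higher throughput.
final class AdderCounter implements Counter {
    private final LongAdder adder = new LongAdder();
    public void inc() { adder.increment(); }
    public long value() { return adder.sum(); }
}

final class AtomicCounter implements Counter {
    private final AtomicLong v = new AtomicLong();
    public void inc() { v.incrementAndGet(); }
    public long value() { return v.get(); }
}

final class ContentionDemo {
    // Bump a counter from several threads; the total must be exact
    // with either backend -- only the throughput differs.
    static long hammer(Counter c, int threads, int perThread) {
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) c.inc();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return c.value();
    }
}
```

Both backends give exact totals; the choice only affects how much threads stall while updating, which is exactly the tradeoff the quoted javadoc describes.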
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254503#comment-15254503 ] Colin Patrick McCabe commented on HDFS-10175: - bq. Can I also note that as the @Public @Stable FileSystem is widely subclassed, with its protected statistics field accessed in those subclasses, nobody is allowed to take it or its current methods away. Thanks. Yeah, I agree. I would like to see us get more cautious about adding new things to {{FileSystem#Statistics}}, though, since I think it's not a good match for most of the new stats we're proposing here. bq. There's no per-thread tracking; it's collecting overall stats, rather than trying to add up the cost of a single execution, which is what per-thread stuff would presumably do. This is lower cost but still permits microbenchmark-style analysis of performance problems against S3a. It doesn't directly let you get results of a job, "34MB of data, 2000 stream aborts, 1998 backward seeks", which are the kind of things I'm curious about. Overall stats are lower cost in terms of memory consumption, and the cost to read (as opposed to update) a metric. They are higher cost in terms of the CPU consumed for each update of the metric. In particular, for applications that do a lot of stream operations from many different threads, updating an AtomicLong can become a performance bottleneck. One of the points that I was making above is that I think it's appropriate for some metrics to be tracked per-thread, but for others, we probably want to use AtomicLong or similar. I would expect that anything that led to an s3 RPC could be tracked by an AtomicLong very easily, since the overhead of the network activity would dwarf the AtomicLong update overhead. And we should have a common interface for getting this information that MR and stats consumers can use. bq. Maybe, and this would be nice, whatever is implemented here is (a) extensible to support some duration type too, at least in parallel. The interface here supports storing durations as 64-bit numbers of milliseconds, which seems good. It is up to the implementor of the statistic to determine what the 64-bit long represents (average duration in ms, median duration in ms, number of RPCs, etc.). This is similar to metrics2 and JMX, where you have basic types that can be used in a few different ways. bq. and (b) could be used as a back end by both Metrics2 and Coda Hale metrics registries. That way the slightly more expensive metric systems would have access to this more raw data. Sure. The difficult question is how metrics2 hooks up to metrics which are per-FS or per-stream. Should the output of metrics2 reflect the union of all existing FS and stream instances? Some applications open a very large number of streams, so it seems impractical for metrics2 to include all those streams in its output.
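Folding durations into the same long-valued statistics, as discussed above, could look like this. The sketch is illustrative (the {{DurationStat}} name and its two-counter layout are assumptions, not from the patch): elapsed milliseconds and an op count are each plain 64-bit longs, so any consumer can derive a mean without the stats API growing a new value type.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative duration statistic built only from 64-bit longs:
// total elapsed milliseconds plus an operation count. A consumer that
// reads both can compute the mean; a consumer that only wants the
// op count reads one long, same as any other counter.
final class DurationStat {
    private final AtomicLong totalMillis = new AtomicLong();
    private final AtomicLong ops = new AtomicLong();

    void record(long elapsedMillis) {
        totalMillis.addAndGet(elapsedMillis);
        ops.incrementAndGet();
    }

    long totalMillis() { return totalMillis.get(); }
    long opCount() { return ops.get(); }

    // Derived, not stored: mean duration in ms (0 when nothing recorded).
    long meanMillis() {
        long n = ops.get();
        return n == 0 ? 0 : totalMillis.get() / n;
    }
}
```

The caller would bracket each blobstore call with a timer and pass the elapsed time to {{record}}; publishing {{totalMillis}} and {{opCount}} as two ordinary long statistics keeps the interface uniform.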
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254489#comment-15254489 ] Steve Loughran commented on HDFS-10175: --- Can I also note that as the {{@Public @Stable FileSystem}} is widely subclassed, with its {{protected statistics}} field accessed in those subclasses, nobody is allowed to take it or its current methods away. Thanks.
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254483#comment-15254483 ] Steve Loughran commented on HDFS-10175: --- In HADOOP-13028 I've just implemented something for S3A, lifted out of Azure's Metrics2 integration; essentially I've got a per-FS-instance set of metrics2 metrics, which are incremented in the FS or input stream as things happen. There's no per-thread tracking; it's collecting overall stats, rather than trying to add up the cost of a single execution, which is what per-thread stuff would presumably do. This is lower cost but still permits microbenchmark-style analysis of performance problems against S3a. It doesn't directly let you get results of a job, "34MB of data, 2000 stream aborts, 1998 backward seeks", which are the kind of things I'm curious about. I think adding some duration tracking for the blobstore ops would be good too; I used that in [Swift|https://github.com/apache/hadoop/tree/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/util], where it helped show that one of the public endpoints was throttling delete calls, so timing out tests. Again, that points more to the classic metrics stuff, or, even better, Coda Hale histograms. Maybe, and this would be nice, whatever is implemented here is (a) extensible to support some duration type too, at least in parallel, and (b) could be used as a back end by both Metrics2 and Coda Hale metrics registries. That way the slightly more expensive metric systems would have access to this more raw data.
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254197#comment-15254197 ] Ming Ma commented on HDFS-10175: This is a great idea. It also addresses an issue at the consumption side. MR iterates through these counters in all FileSystems regardless of whether they are implemented or not, which increases the number of MR Counters, causing unnecessary overhead at the MR layer such as RPC, history file, web UI, etc. As part of the patch evaluation, it might be useful to try this out with MR end to end. Ideally MR shouldn't need to change code each time a new method is added to HDFS ClientProtocol. If so, the MR File System Counter needs to be more dynamic; maybe the more descriptive File System Counter string should come from FileSystem, etc. It might help to flush out some of the details.
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252797#comment-15252797 ] Mingliang Liu commented on HDFS-10175: -- Hi [~cmccabe], I think you proposed an innovative idea for supporting the per-operation stats. I'm willing to change the current design if your idea makes more sense in the long term. I'll prepare a full patch after reviewing your sample code.
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251041#comment-15251041 ] Hadoop QA commented on HDFS-10175: -- -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 6m 45s | trunk passed |
| +1 | compile | 5m 44s | trunk passed with JDK v1.8.0_77 |
| +1 | compile | 6m 38s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 23s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 1m 34s | trunk passed |
| +1 | javadoc | 0m 56s | trunk passed with JDK v1.8.0_77 |
| +1 | javadoc | 1m 3s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 41s | the patch passed |
| +1 | compile | 5m 53s | the patch passed with JDK v1.8.0_77 |
| +1 | javac | 5m 53s | the patch passed |
| +1 | compile | 6m 42s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 42s | the patch passed |
| -1 | checkstyle | 0m 23s | hadoop-common-project/hadoop-common: patch generated 29 new + 131 unchanged - 0 fixed = 160 total (was 131) |
| +1 | mvnsite | 0m 55s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| -1 | findbugs | 1m 52s | hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 53s | the patch passed with JDK v1.8.0_77 |
| +1 | javadoc | 1m 4s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 7m 37s | hadoop-common in the patch failed with JDK v1.8.0_77. |
| -1 | unit | 7m 44s | hadoop-common in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 24s | Patch does not generate ASF License warnings. |
| | | 60m 0s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-common-project/hadoop-common |
| | Should org.apache.hadoop.fs.FileSystemStorageStatistics$LongStatisticIterator be a _static_ inner class? At FileSystemStorageStatistics.java:[lines 50-74] |
| JDK v1.8.0_77 Failed junit tests | hadoop.fs.TestFilterFileSystem |
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250807#comment-15250807 ] Colin Patrick McCabe commented on HDFS-10175: - I thought about this a little bit more, and I don't think that {{FileSystem#Statistics#StatisticsData}} is the best place to add these new statistics. There are a few reasons. Firstly, the statistics that we're interested in are inherently filesystem-specific. For HDFS, we're interested in the number of RPCs to the NameNode: calls like primitiveCreate, getBytesWithFutureGS, or concat. For something like s3a, we're interested in how many PUT and GET requests we've made to Amazon S3. s3 doesn't even support genstamps or the concat operation. Local filesystems have their own operations which are important. Secondly, the thread-local-data mechanism is not really appropriate for most operations. Thread-local data is a big performance win when reading or writing bytes of data from or to a stream, since most such operations don't involve making an RPC. We have big client-side buffers, which means that most reads and writes can return immediately. In contrast, operations like mkdir, rename, delete, etc. always end up making at least one RPC, since these operations cannot be buffered on the client. In that case, the CPU overhead of doing an atomic increment is negligible, but the overhead of storing all that thread-local data is significant. I think what we should do is add an API to the FileSystem and FileContext base classes, which different types of FS can implement as appropriate.
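The tradeoff described above can be sketched in miniature. This is a hedged illustration, not Hadoop's actual implementation: a thread-local cell for the hot per-byte stream path (uncontended increments, but per-thread memory), versus a single shared {{AtomicLong}} for RPC-backed ops like mkdir, where network latency dwarfs the cost of an atomic increment. A production version would also need to handle cross-thread visibility of the cells; this sketch assumes callers tolerate a slightly stale aggregate, much as the existing per-thread FileSystem.Statistics does.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Per-thread counting for hot stream paths: each thread bumps its own
// cell with a plain add (no contention), and readers sum all cells.
// The aggregate may be slightly stale without extra synchronization.
final class BytesReadStat {
    private final ConcurrentLinkedQueue<long[]> cells = new ConcurrentLinkedQueue<>();
    private final ThreadLocal<long[]> local = ThreadLocal.withInitial(() -> {
        long[] cell = new long[1];
        cells.add(cell);   // register the cell so total() can find it
        return cell;
    });

    void addBytes(long n) { local.get()[0] += n; }  // uncontended fast path

    long total() {
        long sum = 0;
        for (long[] c : cells) sum += c[0];
        return sum;
    }
}

// Shared atomic counting for RPC-backed ops: one mkdir means one
// NameNode round trip, so the atomic increment's cost is negligible
// and no per-thread storage is needed.
final class MkdirStat {
    private final AtomicLong ops = new AtomicLong();
    void onMkdir() { ops.incrementAndGet(); }
    long total() { return ops.get(); }
}
```

The point of the contrast: BytesReadStat pays memory per thread to make the per-byte path free of contention, while MkdirStat pays one cheap atomic op per RPC and nothing per thread.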
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248877#comment-15248877 ] Colin Patrick McCabe commented on HDFS-10175: - I'll take a look tomorrow, [~liuml07]. Thanks for working on this.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248792#comment-15248792 ] Mingliang Liu commented on HDFS-10175: -- Hi [~cmccabe], would you kindly have a look at the current patch? Is it OK according to our discussion?
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15244044#comment-15244044 ] Hadoop QA commented on HDFS-10175: -- -1 overall.
Failing checks:
* checkstyle: root: patch generated 1 new + 211 unchanged - 1 fixed = 212 total (was 212)
* unit: hadoop-common failed with JDK v1.8.0_77 and v1.7.0_95; hadoop-hdfs failed with JDK v1.8.0_77 and v1.7.0_95
Passing checks: @author, test4tests, mvninstall, compile, javac, mvnsite, mvneclipse, whitespace, findbugs, javadoc (both JDKs), and the hadoop-hdfs-client unit tests (both JDKs).
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243970#comment-15243970 ] Mingliang Liu commented on HDFS-10175: -- Thanks [~cmccabe] very much for the discussion. # The v5 patch makes the detailed statistics optional. As the statistics are shared among file system objects, it's hard to enable/disable them with a per-file-system config key. Instead, the patch adds a new API to {{Statistics}} to enable/disable this feature according to the tradeoff between reduced cost and detailed per-op counters; the extra overhead is avoided when the enum map is not constructed. # I filed [HADOOP-13031] to track the discussion and effort of refactoring the code that maintains rack-aware counters. Specifically, I also think it's not good to expose the internal composite data structure of distance-aware bytes read. Use cases that iterate over all the distances would call {{getBytesReadByDistance(int distance)}} multiple times, which internally triggers the aggregation of all threads' statistics data each time. To address this, they can use {{getData()}} to get all the statistics data at once. I reviewed the current patch of [MAPREDUCE-6660], which employs the bytes-read-by-distance counters, and found it uses {{getData()}} as I expected. # Based on the current FileSystem design, we see no better choice than putting these counters in FileSystem$Statistics, for supporting either the distance-aware read counters (HDFS-specific) or the per-operation counters (many of which are HDFS-specific). For now, when a detailed statistic is missing (e.g. {{S3AFileSystem#append()}}), we treat it as zero. If some operations' statistics differ, the concrete file systems can update the statistics accordingly (e.g. {{S3AFileSystem#rename}}), as the counters are populated in the concrete file system operations. 
Another point is that, for the existing {{readOps}}/{{writeOps}} counters, we have a similar scenario (and similar challenges); we will file follow-up jiras if we have specific cases to handle. # I created a new jira, [HADOOP-13032], to track the effort of moving the {{Statistics}} class out of {{FileSystem}} for shorter source code and a simpler class structure, though it is an incompatible change.
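The opt-in behavior described in point 1 above (a {{Statistics}}-level switch, with no enum map allocated while detailed stats are disabled) can be sketched as follows. {{OpStats}}, its {{Op}} enum, and the method names here are hypothetical stand-ins for the patch's actual classes, not the real Hadoop code.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of opt-in per-operation counters on a shared
// statistics object: the EnumMap is only allocated when the feature is
// enabled, so disabled clients pay no memory overhead for it.
class OpStats {
    enum Op { CREATE, DELETE, MKDIRS, RENAME }

    private volatile Map<Op, AtomicLong> opCounters; // null => disabled

    synchronized void setDetailedStatsEnabled(boolean enabled) {
        if (enabled && opCounters == null) {
            Map<Op, AtomicLong> m = new EnumMap<>(Op.class);
            for (Op op : Op.values()) {
                m.put(op, new AtomicLong());
            }
            opCounters = m;
        } else if (!enabled) {
            opCounters = null;
        }
    }

    void increment(Op op) {
        Map<Op, AtomicLong> m = opCounters;
        if (m != null) {          // no-op while the feature is disabled
            m.get(op).incrementAndGet();
        }
    }

    long get(Op op) {
        Map<Op, AtomicLong> m = opCounters;
        return m == null ? 0L : m.get(op).get();
    }
}
```

The atomic increments keep the counters thread-safe without per-thread copies, matching the earlier observation that an atomic increment is cheap for operations that make an RPC anyway.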
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236486#comment-15236486 ] Hadoop QA commented on HDFS-10175: -- -1 overall.
Failing checks:
* checkstyle: root: patch generated 1 new + 211 unchanged - 1 fixed = 212 total (was 212)
* unit: hadoop-hdfs failed with JDK v1.8.0_77 and v1.7.0_95
Passing checks: @author, test4tests, mvninstall, compile, javac, mvnsite, mvneclipse, whitespace, findbugs, javadoc (both JDKs), and the hadoop-common and hadoop-hdfs-client unit tests (both JDKs).
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236410#comment-15236410 ] Colin Patrick McCabe commented on HDFS-10175: - Thanks, [~mingma]. I agree that if we are going with a map-based approach, we can simplify the implementation of HDFS-9579. This may also save a small amount of space; for example, if FileSystem threads don't have any bytes read at distance 3 or 4, they will no longer need to store space for {{bytesReadDistanceOfThreeOrFour}}. It also seems like a good idea to make the detailed statistics optional. Then clients could opt in to paying the overheads, rather than having the overheads imposed whether they want them or not. [~liuml07] wrote: bq. 2) Move long getBytesReadByDistance(int distance) from Statistics to StatisticsData. If the user is getting bytes read for all distances, she can call getData() and then iterate the map/array, in which case the getData() will be called only once. For cases of 1K client threads, this may save the effort of aggregation. Colin Patrick McCabe may have different comments about this? Hmm. I'm not sure I see much of a benefit to this. Users should not be iterating over internal data structures; it exposes implementation details that we don't want to expose. I also don't see how it's more efficient, since you have to iterate over the data in either case. bq. For newly supported APIs, adding an entry in the map and one line of increment in the new method will make the magic done. From the point of file system APIs, its public methods are not evolving rapidly. I agree that adding new statistics is relatively easy with the current patch. My comment was more that, currently, many of these statistics are HDFS-specific. Since MapReduce needs to work with alternate filesystems, it will need to handle the case where these detailed statistics are missing or slightly different. bq. 
Another dimension will be needed for cross-DC analysis, while based on the current use case, I don't think this dimension is heavily needed. One point is that, all the same kind of file system is sharing the statistic data among threads, regardless of the remote/local HDFS clusters. It is not the case that "all the same kind of file system is sharing the statistic data among threads." Each {{FileSystem}} instance has a separate Statistics object associated with it:
{code}
/** Recording statistics per a FileSystem class */
private static final Map<Class<? extends FileSystem>, Statistics> statisticsTable =
    new IdentityHashMap<Class<? extends FileSystem>, Statistics>();
{code}
So as long as you use separate FS instances for the remote FS and local FS, it should work for this use case. This is also part of the point I was making earlier that memory overheads add up quickly, since each new FS instance multiplies the overhead.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236382#comment-15236382 ] Mingliang Liu commented on HDFS-10175: -- Thank you very much [~mingma] for the comments. Yes, the NN also has an audit log which tracks DFS operations; the missing DFSClient name makes it less useful for the user. Your point about the motivation, and about the map vs. inline-long implementation of [HDFS-9579], made a lot of sense to me. I see no performance benefit from the map approach. I was wondering whether using a composite data structure (e.g. enum map, array) to manage the distance->bytesRead mapping would make the code simpler. 0) {{StatisticsData}} should be a bit shorter by delegating the operations to the composite data structure. 1) {{incrementBytesReadByDistance(int distance, long newBytes)}} and {{getBytesReadByDistance(int distance)}}, which switch-case over hard-coded variables, may be simplified, as we can set/get the bytes read by distance directly from the map/array. 2) Move {{long getBytesReadByDistance(int distance)}} from {{Statistics}} to {{StatisticsData}}. If the user is getting bytes read for all distances, she can call getData() and then iterate over the map/array, in which case getData() will be called only once. For cases with 1K client threads, this may save the effort of repeated aggregation. [~cmccabe] may have different comments about this? For newly supported APIs, adding an entry in the map and one line of increment in the new method is all that is needed; the public methods of the file system API are not evolving rapidly. Another dimension would be needed for cross-DC analysis, but based on the current use case, I don't think this dimension is heavily needed. One point is that all file systems of the same kind share the statistics data among threads, regardless of remote/local HDFS clusters. [~jnp] also suggested making this feature optional/configurable. 
I found it hard, largely because different file system objects have individual configurations. As they share the statistics data by FS class type (scheme), it would be confusing if one FS disabled this feature while another FS enabled it. Is there any easy approach to handling this case? p.s. the v4 patch is to address the {{HAS_NEXT}}/{{LIST_STATUS}}.
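The getData()-once idea in point 2 above (aggregate all per-thread data a single time, then iterate the snapshot instead of re-aggregating per distance) can be illustrated with simplified stand-ins for {{Statistics}} and {{StatisticsData}}; the class and field names here are assumptions for illustration, not the real Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a per-thread StatisticsData holding
// bytes read indexed by network distance.
class StatsData {
    long[] bytesReadByDistance = new long[5]; // index = network distance

    void add(StatsData other) {
        for (int i = 0; i < bytesReadByDistance.length; i++) {
            bytesReadByDistance[i] += other.bytesReadByDistance[i];
        }
    }
}

// Simplified stand-in for Statistics: aggregates across all registered
// per-thread data exactly once per getData() call. Callers that want all
// distances iterate the returned snapshot rather than calling a
// per-distance getter that would re-aggregate every time.
class Stats {
    private final List<StatsData> perThread = new ArrayList<>();

    void register(StatsData d) { perThread.add(d); }

    StatsData getData() {
        StatsData total = new StatsData();
        for (StatsData d : perThread) {
            total.add(d);
        }
        return total;
    }
}
```

With 1K client threads, one getData() call walks the thread list once, instead of once per distance queried.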
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232824#comment-15232824 ] Ming Ma commented on HDFS-10175: Interesting work. * Seems like a useful feature to provide per-job counters for this. Note that the NN also has the data, except that nntop and the audit log don't track it on a per-DFSClient-name basis. For the HDFS-9579 scenario, besides the per-job counters functionality, another motivation is the extra work required to track and aggregate on the DN side (something we still want to experiment with to track hot blocks, or remote reads or writes for all HDFS apps, not just MR). * Question about using inline longs or a map object. The original patch in HDFS-9579 used a map, as I tried to make it more dynamic in two respects: no need to hard-code network distances, and no memory allocation for FileSystems that don't generate such data. But the small extra memory for static longs and the mostly static network distance definition make the map approach less interesting. * Growth of the metrics. It seems each time we add an RPC method, we need to add another entry. Also, if someone wants to track cross-DC mkdir and intra-DC mkdir separately, that adds another dimension. * Making it optional is a good idea.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229596#comment-15229596 ] Mingliang Liu commented on HDFS-10175: -- Thanks for your test quantifying the overhead of the per-op stats map. The data you got is similar to our offline analysis. As for what the ~1.5K overhead per FileSystem/thread buys us, the basic idea is to provide a file system operation summary at a fine granularity. Before this, we had FS operation counters such as read ops, write ops, and largeReadOps, which are not very useful for offline load analysis. One simple use case: a directory is polluted by 1K small files because a job forgot to delete its temporary files after using them. The {{create}}/{{delete}} counters will help users/admins locate the bad job very easily, whereas writeOps is not very indicative, as it counts many other operations. I believe [~hitesh] can show us more examples if you find his comment above unclear. As the statistics are used by other file systems (e.g. S3A) besides HDFS, the per-operation counters can be supported by those file systems too. FileSystem itself has several high-level operations that are not supported by all concrete file systems, in which case a zero counter value for an unsupported operation seems OK. Maybe [~jnp] has more comments about this problem. Yes, {{HAS_NEXT}} is not a metric that is very interesting to users; I'll update the patch for iterative {{listStatus}}. This is a very good catch. In its very early stages, [HDFS-9579] used a map (perhaps because it's more straightforward). I think the idea of moving the hard-coded statistics longs into maps is good in this case; I'll file a new jira to address it. Ping [~mingma] for more input. By the way, I think it would be good to move the {{Statistics}} class out of {{FileSystem}}. However, it's an incompatible change, as import statements in downstream applications would have to be updated. 
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222967#comment-15222967 ] Jitendra Nath Pandey commented on HDFS-10175: - [~cmccabe], are you OK withdrawing your -1, now that [~hitesh] has explained the use case and the memory overhead seems minimal? One more option is to add a configuration parameter on the client side, so that per-operation statistics can be disabled if the client doesn't want to track them. [~liuml07], I believe it should be easy to add a configuration to disable the per-op stats?
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217401#comment-15217401 ] Jitendra Nath Pandey commented on HDFS-10175: - Each entry in the map should be around 24 bytes (8 bytes for the key enum pointer, 8 bytes for the value object pointer, and 8 bytes for the long counter). EnumMap uses arrays, unlike a HashMap, and the enum keys are singletons, so the overheads are negligible. For 48 entries that is about 1.1 KB per FS, and even for 1,000 threads it is only 1.1 MB; triple that and it's 3.3 MB. For Java processes that create 1,000 threads this is a trivial amount of memory.
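The EnumMap layout being estimated above can be sketched as follows. This is an illustrative structure, not the actual patch code; the operation names are a subset of the roughly 48 real ones.

```java
import java.util.EnumMap;

// Sketch of a per-operation counter map. EnumMap is backed by a plain
// array indexed by the enum's ordinal, so a lookup is an array access,
// and the per-entry footprint is roughly the references and counter
// estimated in the comment above.
class OpCounters {
    enum Op { CREATE, APPEND, CREATE_SYMLINK, DELETE, EXISTS, MKDIRS, RENAME } // ~48 ops in practice

    private final EnumMap<Op, long[]> counters = new EnumMap<>(Op.class);

    OpCounters() {
        for (Op op : Op.values()) {
            counters.put(op, new long[1]); // mutable holder avoids boxing a new Long per increment
        }
    }

    void increment(Op op) {
        counters.get(op)[0]++;
    }

    long get(Op op) {
        return counters.get(op)[0];
    }
}
```

Because the enum constants are JVM singletons, only the value holders and the backing array are new allocations per map instance.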
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216566#comment-15216566 ] Colin Patrick McCabe commented on HDFS-10175: - It's not a map with 48 entries; it's a map with 48 entries per thread per FS. So if you have 1,000 threads, that's 48,000 entries. Double or triple that if you have more than one FS (for example, a local filesystem plus an HDFS filesystem).
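The scaling concern can be put in numbers by combining both comments: entries scale as ops × threads × filesystems, at the ~24 bytes/entry estimated elsewhere in this thread. A quick back-of-envelope helper (illustrative only):

```java
// Back-of-envelope arithmetic for the per-thread, per-FS map overhead
// discussed above, using the ~24 bytes/entry estimate from this thread.
class OverheadEstimate {
    static long entries(int opsPerMap, int threads, int fileSystems) {
        return (long) opsPerMap * threads * fileSystems;
    }

    static long bytes(long entries, int bytesPerEntry) {
        return entries * bytesPerEntry;
    }

    public static void main(String[] args) {
        long e = entries(48, 1000, 1); // 48,000 entries for 1,000 threads and one FS
        System.out.println(e + " entries, ~" + bytes(e, 24) / 1024 + " KiB");
    }
}
```

At three filesystems that becomes 144,000 entries, still only a few megabytes, which is the crux of the disagreement between the two positions.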
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216509#comment-15216509 ] Hitesh Shah commented on HDFS-10175: [~cmccabe] Cluster admins who keep tabs on Hive queries or Pig scripts, to see what kind of ops they are doing, use the job counters today. Unfortunately, the current job counters (read ops, write ops, etc.) are quite old, very high level, and cannot be used to find bad jobs that, say, create thousands of files (for partitions, etc.). The approach Jitendra and Mingliang seem to be proposing integrates well with MapReduce and similar frameworks, where counters could be created that map to the filesystem stats. Additionally, these counters would be available for historical analysis as needed via the job history. This is something that would likely be needed for all jobs. To some extent I do agree that all of the 50-odd metrics being tracked may not be useful for every use case. However, enabling this via HTrace and turning on HTrace for all jobs is not an option either, as it does not allow selective tracking of only certain info: HTrace being enabled for a job means in-depth profiling that is not needed unless doing very specific investigation. Do you have any suggestions on how the app layers above HDFS can obtain more in-depth stats while keeping all of these stats accessible for all jobs?
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215270#comment-15215270 ] Jitendra Nath Pandey commented on HDFS-10175: - This proposal is a very simple, lightweight way to collect statistics for various file system operations, compared to HTrace, which is significantly more complicated to set up and operate. The counters in MapReduce have been in use for a long time and are very useful for gathering job-level information, which is critical to analyzing job behavior. The proposed map has only 48 entries, and being an enum map it will be significantly optimized. I think this is a very simple patch and a good low-hanging fruit for more detailed analysis of job behavior.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215187#comment-15215187 ] Colin Patrick McCabe commented on HDFS-10175: - I'm still -1 on this change. v3 creates a very large map per FileSystem per thread. For an application with lots of threads, this overhead is unacceptable. HTrace seems like a better way to get this information. I don't see a clear use case for this.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215055#comment-15215055 ] Jitendra Nath Pandey commented on HDFS-10175: - [~cmccabe] and [~andrew.wang], the latest patch addresses the concerns about synchronization. Do you think it is OK to commit now?
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211694#comment-15211694 ] Hadoop QA commented on HDFS-10175: -- (x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 14s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 46s | trunk passed |
| +1 | compile | 6m 27s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 7m 21s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 6s | trunk passed |
| +1 | mvnsite | 2m 35s | trunk passed |
| +1 | mvneclipse | 0m 41s | trunk passed |
| +1 | findbugs | 5m 36s | trunk passed |
| +1 | javadoc | 2m 31s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 21s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 1s | the patch passed |
| +1 | compile | 6m 16s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 6m 16s | the patch passed |
| +1 | compile | 7m 3s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 7m 3s | the patch passed |
| -1 | checkstyle | 1m 7s | root: patch generated 1 new + 211 unchanged - 1 fixed = 212 total (was 212) |
| +1 | mvnsite | 2m 27s | the patch passed |
| +1 | mvneclipse | 0m 39s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 6m 1s | the patch passed |
| +1 | javadoc | 2m 21s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 17s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 7m 1s | hadoop-common in the patch failed with JDK v1.8.0_74. |
| +1 | unit | 0m 52s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_74. |
| +1 | unit | 58m 52s | hadoop-hdfs in the patch passed with JDK v1.8.0_74. |
| -1 | unit | 7m 16s | hadoop-common in the patch failed with JDK v1.7.0_95. |
| +1 | unit | 1m 0s | hadoop-hdfs-client in the patch passed with JDK v1.7.0_95. |
| -1 | unit | 55m 38s | hadoop-hdfs in the patch failed |
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210877#comment-15210877 ] Colin Patrick McCabe commented on HDFS-10175: - -1. As [~andrew.wang] mentioned, this change imposes heavy synchronization overheads on everyone. It would be better to gather this sort of information through HTrace, so that it could be turned on only for applications being traced.
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209960#comment-15209960 ] Hadoop QA commented on HDFS-10175: -- (x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 1s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 32s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 44s | trunk passed |
| +1 | compile | 6m 5s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 6m 46s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 7s | trunk passed |
| +1 | mvnsite | 2m 21s | trunk passed |
| +1 | mvneclipse | 0m 41s | trunk passed |
| +1 | findbugs | 5m 5s | trunk passed |
| +1 | javadoc | 2m 22s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 15s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 2s | the patch passed |
| +1 | compile | 5m 53s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 5m 53s | the patch passed |
| +1 | compile | 6m 51s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 51s | the patch passed |
| +1 | checkstyle | 1m 6s | root: patch generated 0 new + 211 unchanged - 1 fixed = 211 total (was 212) |
| +1 | mvnsite | 2m 20s | the patch passed |
| +1 | mvneclipse | 0m 41s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 5m 46s | the patch passed |
| +1 | javadoc | 2m 29s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 20s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 6m 47s | hadoop-common in the patch failed with JDK v1.8.0_74. |
| +1 | unit | 0m 52s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_74. |
| -1 | unit | 69m 17s | hadoop-hdfs in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 7m 6s | hadoop-common in the patch failed with JDK v1.7.0_95. |
| +1 | unit | 1m 1s | hadoop-hdfs-client in the patch passed with JDK v1.7.0_95. |
| -1 | unit | 70m 9s | hadoop-hdfs in the patch failed |
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205910#comment-15205910 ] Hadoop QA commented on HDFS-10175: -- (x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 9s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 41s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 35s | trunk passed |
| +1 | compile | 5m 56s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 6m 41s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 7s | trunk passed |
| +1 | mvnsite | 2m 21s | trunk passed |
| +1 | mvneclipse | 0m 40s | trunk passed |
| +1 | findbugs | 5m 9s | trunk passed |
| +1 | javadoc | 2m 20s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 17s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 59s | the patch passed |
| +1 | compile | 5m 47s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 5m 47s | the patch passed |
| +1 | compile | 6m 40s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 40s | the patch passed |
| -1 | checkstyle | 1m 5s | root: patch generated 2 new + 211 unchanged - 1 fixed = 213 total (was 212) |
| +1 | mvnsite | 2m 19s | the patch passed |
| +1 | mvneclipse | 0m 39s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 5m 53s | the patch passed |
| +1 | javadoc | 2m 18s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 14s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 6m 33s | hadoop-common in the patch failed with JDK v1.8.0_74. |
| +1 | unit | 0m 49s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_74. |
| -1 | unit | 56m 31s | hadoop-hdfs in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 7m 9s | hadoop-common in the patch failed with JDK v1.7.0_95. |
| +1 | unit | 0m 59s | hadoop-hdfs-client in the patch passed with JDK v1.7.0_95. |
| -1 | unit | 59m 50s | hadoop-hdfs in the patch failed |
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205086#comment-15205086 ] Mingliang Liu commented on HDFS-10175: -- Thanks for your comment, [~andrew.wang]. I was aware of the thread-local statistics data structure and was in favor of following the same approach; the new operation map is still per-thread. ConcurrentHashMap was used because, when aggregating, we have to make sure the map is not being modified; its functionality is similar to the "volatile" keyword on the other primitive statistics data. Anyway, I will revise the code and update the patch if ConcurrentHashMap turns out to be unnecessary, for the sake of performance. Before that, the next patch will first resolve the conflicts with trunk caused by [HDFS-9579].
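The ConcurrentHashMap approach under discussion can be sketched as follows. This is an illustrative reconstruction, not the actual patch: writers increment without external locking, and an aggregator can read a consistent view at any time.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of per-op stats kept in a ConcurrentHashMap, the
// style Andrew Wang flagged. Increments are lock-free via LongAdder, and
// a reader thread sees a safe (if slightly stale) view without locking.
class ConcurrentOpStats {
    private final ConcurrentHashMap<String, LongAdder> ops = new ConcurrentHashMap<>();

    void record(String op) {
        ops.computeIfAbsent(op, k -> new LongAdder()).increment();
    }

    long total(String op) {
        LongAdder a = ops.get(op);
        return a == null ? 0L : a.sum();
    }
}
```

The trade-off is that every increment touches a shared concurrent structure, which is exactly the contention the per-thread design avoids.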
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200335#comment-15200335 ] Andrew Wang commented on HDFS-10175: Hi [~liuml07], we spent some effort optimizing the Statistics data structure to make it per-thread and avoid synchronization overheads; the overhead showed up for apps like Spark that do multithreaded FileSystem access. I noticed this change simply uses a ConcurrentHashMap. Can we please stick with the per-thread counters and aggregation style?
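The per-thread counters and aggregation style referred to above can be sketched like this. The class and field names are illustrative, not the real Hadoop classes; the pattern is that each thread writes only to its own data, so increments need no locks, and a rare read aggregates across all threads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the per-thread counter + aggregation pattern.
class PerThreadStats {
    private static final class Data {
        volatile long writeOps; // volatile so the aggregating reader sees updates
    }

    // Registry of every thread's Data, walked only at aggregation time.
    private final Set<Data> allData = ConcurrentHashMap.newKeySet();
    private final ThreadLocal<Data> local = ThreadLocal.withInitial(() -> {
        Data d = new Data();
        allData.add(d);
        return d;
    });

    // Hot path: touches only this thread's own Data, no shared-state contention.
    void incrementWriteOps() {
        local.get().writeOps++;
    }

    // Cold path: sum over all threads; slower, but reads are infrequent.
    long totalWriteOps() {
        long sum = 0;
        for (Data d : allData) {
            sum += d.writeOps;
        }
        return sum;
    }
}
```

This keeps the write path contention-free at the cost of a linear scan when a framework (e.g. MapReduce counter reporting) asks for totals.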